Re: Speech to text



agalkin@xxxxxxxxxxx writes:

Thank you for your response. Actually what I want is quite simple. I
want to take audio and create a time stamp table of values where words
start. For instance the first word starts at 250 ms, the second word at
700 ms, the third word at 1.5 s, etc. I do have access to both audio
and corresponding text.

As I said before, this may be easy or hard depending on how long your
audio is (10 s is easy to align, several hours are not) and what the
acoustic conditions are (non-speech confuses things). Also I forgot to
mention that often the text and the audio differ and that might throw
things off (e.g. newsreaders sometimes don't say exactly what is on the
prompt, go off prompt for a bit, books have chapter titles that aren't
spoken, and subtitles are contracted versions of what was spoken).
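One way to see how far the text and the audio have drifted apart is to diff the written text against a rough transcript of what was actually said. The sketch below is only an illustration: it assumes you already have such a transcript from somewhere (for example a first recognition pass), and the function name and sample sentences are made up for the example. It uses Python's standard difflib module.

```python
import difflib

def mismatched_regions(script_words, spoken_words):
    """Return (tag, script_span, spoken_span) for each region where the lists differ.

    tag is one of 'delete' (in the script but not spoken), 'insert' (spoken but
    not in the script) or 'replace' (different words in the same place).
    """
    matcher = difflib.SequenceMatcher(a=script_words, b=spoken_words, autojunk=False)
    return [
        (tag, script_words[i1:i2], spoken_words[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

# Hypothetical data: a chapter title that is not read out, and an ad-libbed word.
script = "chapter one it was a dark and stormy night".split()
spoken = "it was a dark and very stormy night".split()

for tag, script_span, spoken_span in mismatched_regions(script, spoken):
    print(tag, script_span, spoken_span)
```

Regions tagged 'delete' are candidates for removal from the text before alignment; 'insert' and 'replace' regions are places where an aligner given only the written text is likely to go wrong.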

HTK has a good tutorial which, if you follow it through, will give you
the result you want - and there is a mailing list if you get stuck.
You'll also need a pronunciation dictionary (e.g. cmudict for American
English, BEEP for British English) and some time to learn it all.
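Once an alignment has been produced, turning it into the table of start times is a small job. The sketch below assumes the alignment has been written out as word-level HTK label lines of the form "start end word", with times in HTK's 100 ns units; the exact layout depends on the options you run the aligner with, so treat the sample data, the function name and the silence labels ("sil", "sp") as assumptions to be adjusted.

```python
HTK_UNITS_PER_MS = 10_000  # HTK label times are counted in units of 100 ns

def word_start_times_ms(label_lines, skip=("sil", "sp")):
    """Build a list of (word, start time in ms) from HTK-style label lines.

    Assumes each useful line looks like "start end word"; anything else
    (the "#!MLF!#" header, quoted file names, the "." terminator) is ignored.
    Labels listed in `skip` (assumed silence model names) are dropped.
    """
    table = []
    for line in label_lines:
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        start, word = int(fields[0]), fields[2]
        if word in skip:
            continue
        table.append((word, start / HTK_UNITS_PER_MS))
    return table

# Hypothetical aligner output, chosen to match the times in the question.
example = """#!MLF!#
"*/utt1.lab"
0 2500000 sil
2500000 7000000 hello
7000000 15000000 big
15000000 21000000 world
.
""".splitlines()

for word, start_ms in word_start_times_ms(example):
    print(f"{start_ms:8.1f} ms  {word}")
```

With the sample lines above, the three words come out at 250 ms, 700 ms and 1500 ms.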

Good luck.


Tony