Re: Speech to text
- From: tony.nospam@xxxxxxxxxxxxxxxxxxxxxxx
- Date: Wed, 14 Jun 2006 09:39:12 GMT
agalkin@xxxxxxxxxxx writes:
Thank you for your response. Actually what I want is quite simple. I
want to take audio and create a timestamp table of the points where words
start. For instance, the first word starts at 250 ms, the second word at
700 ms, the third word at 1.5 s, etc. I have access to both the audio
and the corresponding text.
As I said before, this may be easy or hard depending on how long your
audio is (10 s is easy to align, several hours are not) and what the
acoustic conditions are (non-speech sounds confuse the aligner). Also, I
forgot to mention that the text and the audio often differ, and that can
throw the alignment off (e.g. newsreaders sometimes don't say exactly what
is on the prompt or go off prompt for a bit, books have chapter titles that
aren't spoken, and subtitles are contracted versions of what was spoken).
HTK has a good tutorial which, if you follow it through, will give you
the result you want - and there is a mailing list if you get stuck.
You'll also need a pronunciation dictionary (e.g. cmudict for American
English, BEEP for British English) and some time to learn it all.
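Once the alignment has run, turning its output into the table of word start times is the easy part. HTK label files store times as integers in units of 100 ns, one "start end label" entry per line. The sketch below assumes you have word-level label lines in that form; the function name word_start_table and the default silence labels ("sil", "sp") are my own choices for illustration, so adjust them to whatever your setup produces.

```python
HTK_UNITS_PER_MS = 10_000  # HTK label times are in 100 ns units

def word_start_table(label_lines, skip=("sil", "sp")):
    """Build a list of (word, start_ms) pairs from HTK-style label lines.

    Lines that do not look like 'start end label' (MLF headers, file
    names, the terminating '.') are ignored. The labels in 'skip' are
    assumed to mark silence and are left out of the table.
    """
    table = []
    for line in label_lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        word = fields[2]
        if word in skip:
            continue
        table.append((word, int(fields[0]) / HTK_UNITS_PER_MS))
    return table

if __name__ == "__main__":
    example = [
        "#!MLF!#",
        '"*/utt1.lab"',
        "0 2500000 sil",
        "2500000 6900000 hello",
        "7000000 14900000 big",
        "15000000 21000000 world",
        ".",
    ]
    for word, start_ms in word_start_table(example):
        print(f"{start_ms:8.1f} ms  {word}")
```

The example data is invented to match the 250 ms / 700 ms / 1.5 s figures mentioned earlier in the thread; it is not real aligner output.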
Good luck.
Tony