Re: Need help with time stretching using STFT



On Jun 25, 7:42 pm, Ben B <benb...@xxxxxxxxxxx> wrote:


I am currently writing a program in Visual Basic for time stretching audio
using the Short Time Fourier Transform (STFT) method. I am using this
method because I want to keep both the analysis and synthesis hop sizes at
a fixed rate,

they can both be fixed sizes, but not equal to each other. for time
*stretching* (as opposed to time compression), the analysis hop size
is shorter than the synthesis hop size.
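A minimal sketch of that relationship, assuming the synthesis hop is fixed by the overlap and the analysis hop is derived from it. The function name `hop_sizes` and the parameters `frame_len`, `stretch`, and `overlap` are made up for illustration and are not from the original post.

```python
def hop_sizes(frame_len, stretch, overlap=0.5):
    """Return (analysis_hop, synthesis_hop) in samples.

    The synthesis hop follows from the frame length and overlap.
    The analysis hop is the synthesis hop divided by the stretch factor,
    so stretch > 1 (time stretching) gives a shorter analysis hop.
    """
    synthesis_hop = int(round(frame_len * (1.0 - overlap)))
    analysis_hop = int(round(synthesis_hop / stretch))
    return analysis_hop, synthesis_hop


sa, ss = hop_sizes(frame_len=1024, stretch=1.5)
print(sa, ss)  # 341 512
```

Rounding the analysis hop to a whole sample means the achieved stretch factor is only approximately the requested one.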

unlike SOLA where the synthesis hop size can vary slightly.

My main algorithm so far successfully steps through and reads in a block of
data from the input buffer, applies a linear window function,

"linear window"? what's that?

then overlap-adds with a 50% overlap at an Ss of alpha x Sa ... and so on until it
reaches the end of the input buffer.

After a lot of research, I now know that after a block of data has been
windowed that I then need to convert it to the frequency domain with a FFT,
do some kind of frequency processing, do an IFFT, then overlap-add. This is
what I'm stuck on at the moment.

your friends are Google and maybe Wikipedia. you need to look up "phase
vocoder". that's what you are trying to do.

What I need to know is once a windowed block of data has been converted to
the frequency domain, what do I actually do with it (in terms of time
stretching), before I convert it back again for overlap-adding?

you need to identify (in the frequency domain) which features correspond
to sinusoidal components in the time domain. if it weren't for windowing,
these would look like spikes in the spectrum. but because of windowing,
they're bumps. you isolate each bump and identify the likely
frequency of the sinusoid that (together with the window function)
created that bump.
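One common way to get that frequency estimate (an assumption here, the post does not say which method to use) is to compare the phase of the same bin in two frames one hop apart. A sketch using only the standard library, with a direct single-bin DFT instead of a real FFT to keep it short; all function names are hypothetical.

```python
import cmath
import math


def hann(n_len):
    """Periodic Hann window of length n_len."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]


def dft_bin(frame, k):
    """Value of DFT bin k of one frame (direct sum, no FFT)."""
    n_len = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * n / n_len)
               for n, x in enumerate(frame))


def wrap(phase):
    """Wrap a phase in radians into [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi


def estimate_freq(signal, start, k, n_len, hop):
    """Estimate the frequency (cycles/sample) of the sinusoid near bin k.

    Two windowed frames 'hop' samples apart are analysed. The measured
    phase advance, minus the advance expected for the bin centre
    frequency, gives the deviation from the bin centre.
    """
    w = hann(n_len)
    frame0 = [signal[start + n] * w[n] for n in range(n_len)]
    frame1 = [signal[start + hop + n] * w[n] for n in range(n_len)]
    phase0 = cmath.phase(dft_bin(frame0, k))
    phase1 = cmath.phase(dft_bin(frame1, k))
    bin_freq = 2 * math.pi * k / n_len              # rad/sample
    deviation = wrap(phase1 - phase0 - bin_freq * hop)
    return (bin_freq + deviation / hop) / (2 * math.pi)


# demo: a sine that sits between bins 13 and 14 of a 256-point frame
true_freq = 0.0513
x = [math.sin(2 * math.pi * true_freq * n) for n in range(400)]
print(estimate_freq(x, start=0, k=13, n_len=256, hop=64))
```

The estimate is only unambiguous while the true frequency stays within half a cycle per hop of the bin centre, which is one reason the analysis hop is usually a small fraction of the frame length.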

then knowing what the frequency is, if the previous frame has a bump
of a similar frequency, you want to adjust the phase of the bump of
the current frame (by multiplying it by a complex exponential,
e^(j*theta)), so that the phase of that particular sinusoid in the
present frame lines up well with the phase of the corresponding
sinusoid in the previous frame (after it was adjusted). that way,
when you overlap-add it to the tail of the previous frame, there is no
destructive interference.
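The phase adjustment for a single bump can be sketched as below. This is an illustration under assumptions, not a full phase vocoder: `align_bin`, `prev_out_phase`, `freq_rad`, and `synth_hop` are hypothetical names, and the frequency is taken as already estimated.

```python
import cmath


def align_bin(curr_bin, prev_out_phase, freq_rad, synth_hop):
    """Rotate curr_bin so it continues the previous output frame's sinusoid.

    curr_bin       : complex spectrum value at the bump's peak
    prev_out_phase : phase (radians) this sinusoid had in the previous
                     frame after that frame was adjusted
    freq_rad       : estimated frequency of the sinusoid, rad/sample
    synth_hop      : synthesis hop size in samples

    Returns (rotated_bin, new_out_phase). The magnitude is unchanged.
    """
    # phase the sinusoid should have one synthesis hop later
    target = prev_out_phase + freq_rad * synth_hop
    # rotation angle theta, applied as a multiply by e^(j*theta)
    theta = target - cmath.phase(curr_bin)
    rotated = curr_bin * cmath.exp(1j * theta)
    return rotated, cmath.phase(rotated)


bin_value = 2.0 * cmath.exp(0.3j)
rotated, phase = align_bin(bin_value, prev_out_phase=1.0,
                           freq_rad=0.1, synth_hop=10)
print(abs(rotated), phase)
```

Applying the same theta to every bin that belongs to one bump keeps the bump's shape intact; whether that is the best grouping is a design choice left open here.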

to make this sound good (not spread out the transitions into mush),
there are other tricks to look into without my help.

I understand this will help keep the pitch/phase correct somehow.

just like the SOLA, you are splicing. but in SOLA, the amount that
each sinusoid is shifted in phase is proportional to the frequency of
that sinusoid times a *common* time displacement (that is the same
value for *all* sinusoids in the frame). the only knob you are able to
adjust is the value of that common time displacement. if the input
signal is monophonic or, maybe a better term: quasi-periodic (i.e. a
single tone with harmonic overtones), then you can find a single time
displacement to line up all of the frequency components of the current
frame to their counterparts in the previous frame. but if your
input is *not* quasi-periodic, then there is no single time
displacement that you can apply to the analysis frame (which changes
the phase of all of the frequency components) that will make every
frequency component happy about the splice. if any sinusoids of
significant amplitude are not spliced with their phases lined up,
glitches will happen.
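For comparison, the single knob in SOLA can be sketched as a search for the one shift that best lines up the new frame with the tail of the previous output. Normalized cross-correlation is assumed as the similarity measure (the post does not name one), and `best_offset` is a hypothetical helper.

```python
import math


def best_offset(prev_tail, candidate, max_shift):
    """Return the shift in [0, max_shift] that best aligns candidate
    with prev_tail, scored by cross-correlation normalized by the
    energy of the candidate segment.

    candidate must hold at least len(prev_tail) + max_shift samples.
    """
    n = len(prev_tail)
    best, best_score = 0, float("-inf")
    for shift in range(max_shift + 1):
        seg = candidate[shift:shift + n]
        num = sum(a * b for a, b in zip(prev_tail, seg))
        den = math.sqrt(sum(b * b for b in seg)) or 1.0  # guard all-zero segment
        score = num / den
        if score > best_score:
            best, best_score = shift, score
    return best


# quasi-periodic test signal with a period of 50 samples
x = [math.sin(2 * math.pi * n / 50) for n in range(300)]
tail = x[100:164]
cand = x[130:240]
print(best_offset(tail, cand, max_shift=40))  # 20
```

With a single tone the search finds a shift where everything lines up. With several unrelated tones no shift maximizes all of them at once, which is the limitation described above.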

with the phase vocoder (or with sinusoidal modeling; in my opinion,
the two frequency-domain methods are moving toward each other and
becoming the same), you can apply a different time displacement (same
as a phase adjustment) to each frequency component, independent of the
time displacement you apply to other sinusoids. but when that happens
and different time displacements are applied to different frequencies,
then the shape of the waveform changes. sometimes that is audible
(really only with transitions or attacks), and most of the time it is
not.

Also, less importantly, what is the best type of window to use for STFT?
I'm currently using a Bartlett window function.

oh so that's what you mean by a "linear window function". a
triangular window. might be okay. Hann window is probably better.
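A small check, assuming the "periodic" forms of both windows, that either one overlap-adds to a constant at 50% overlap. The helper names are made up for this sketch.

```python
import math


def hann(n_len):
    """Periodic Hann window: 0.5 - 0.5*cos(2*pi*n/N)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]


def bartlett(n_len):
    """Periodic triangular (Bartlett) window: 1 - |2n/N - 1|."""
    return [1.0 - abs(2.0 * n / n_len - 1.0) for n in range(n_len)]


def overlap_sum(window):
    """Sum of a window and its copy shifted by half its length,
    evaluated over one hop. A constant result means unmodified
    frames overlap-add back to the input (up to that constant)."""
    half = len(window) // 2
    return [window[n] + window[n + half] for n in range(half)]


for name, w in (("hann", hann(64)), ("bartlett", bartlett(64))):
    s = overlap_sum(w)
    print(name, min(s), max(s))
```

Both pass this test, so the case for Hann rests on its smoother taper and lower sidelobes, which make the bumps in the spectrum easier to separate.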

r b-j