A wider window in framing

The idea of a wider window was described in an earlier section in relation to the framing equations. In this case we use input speech segments that are, for example, 3 times longer than the usual frames. The front-end neural network, which performs the windowing and the Hartley (or Fourier) transform, then generates 3 vectors, which are fed to the filter bank as if they were consecutive vectors. The idea is that each of these three vectors has been generated within the context of a longer time interval. In principle this is the same procedure as the contextual information extraction in [, ]. The two methods differ, however, in the point at which the contextual information is extracted: in the suggested method it is done at the very beginning, while in the other method it is done after pre-processing. Extracting contextual information at an early point has the advantage that we can make use of all the information available in the original speech signal. At a later point we have only a signal that has lost part of its information through the reduction of dimensionality. Better results can therefore be expected from the suggested method. The drawback of this approach is the network size, which may be very large. Nevertheless, a system which accepts a window 3 times longer than a 400-sample frame can be implemented on an ordinary SUN workstation.
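The wider-window front end and its cost in network size can be illustrated with a minimal sketch. Several details here are assumptions that the text does not specify: the Hamming window, the use of a dense Hartley matrix as a stand-in for the linear layer of the front-end network, and in particular the way the 1200 transform coefficients are split into 3 vectors of 400. The names `hartley_matrix`, `wide_window_front_end` and `weight_count` are hypothetical.

```python
import numpy as np

FRAME = 400            # usual frame length in samples (from the text)
FACTOR = 3             # the wider window is 3 times longer (from the text)
WIDE = FRAME * FACTOR  # 1200 samples


def hartley_matrix(n):
    """Dense n x n discrete Hartley transform matrix, cas(x) = cos(x) + sin(x).

    Assumption: this matrix plays the role of the weights of a linear
    layer in the front-end network.
    """
    k = np.arange(n)
    arg = 2.0 * np.pi * np.outer(k, k) / n
    return np.cos(arg) + np.sin(arg)


def wide_window_front_end(segment):
    """Window a 1200-sample segment, transform it, and return 3 vectors.

    Assumptions: a Hamming window, and a plain split of the coefficients
    into FACTOR consecutive blocks of FRAME values each.
    """
    if segment.shape != (WIDE,):
        raise ValueError("expected a segment of %d samples" % WIDE)
    windowed = segment * np.hamming(WIDE)
    coeffs = hartley_matrix(WIDE) @ windowed
    return coeffs.reshape(FACTOR, FRAME)


def weight_count(n):
    """Number of weights in a dense n-input, n-output transform layer."""
    return n * n


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vectors = wide_window_front_end(rng.standard_normal(WIDE))
    print(vectors.shape)
    print(weight_count(FRAME), weight_count(WIDE))
```

Under these assumptions the dense transform layer grows with the square of the window length, so a window 3 times longer needs 9 times as many weights, which is one way to see why the network size becomes the limiting factor.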

Narada Warakagoda
Fri May 10 20:35:10 MET DST 1996
