Training


We assume that the preprocessing part of the system outputs a sequence of observation vectors.

Starting from a certain set of initial values, the parameters of each of the HMMs can be updated as given by eqn. 1.19, with the required gradients given by eqns. 1.44 and 1.48. For this particular case of isolated recognition, however, the likelihoods in the last two equations are calculated in a special way.
First consider the clamped case. Since we have an HMM for each class of units in isolated recognition, we can select the model of the class l to which the current observation sequence belongs. Then, starting from eqn. 1.39,

where the second line follows from eqn. 1.3.

Similarly for the free case, starting from eqn. 1.40,

where each term represents the likelihood of the current observation sequence, which belongs to class l, in the respective model. With the likelihoods defined in eqns. 1.52 and 1.53, the gradient equations 1.44 and 1.48 take the forms,

Now we can summarize the training procedure as follows.

(1)
Initialize each HMM with values generated randomly or by an initialization algorithm such as segmental K-means [].
(2)
Take an observation sequence and:
• Calculate the forward and backward probabilities for each HMM, using the recursions 1.5 and 1.2.
• Using equations 1.52 and 1.53, calculate the likelihoods.
• Using equations 1.54 and 1.55, calculate the gradients with respect to the parameters of each model.
• Update the parameters of each model using eqn. 1.19.
(3)
Go to step (2) unless all the observation sequences have been considered.
(4)
Repeat steps (2) and (3) until a convergence criterion is satisfied.
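The first two bullets of step (2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure's reference implementation: the HMMs are taken to be discrete, the textbook forward and backward recursions stand in for recursions 1.5 and 1.2, the free likelihood is taken to be the plain sum of the per-model likelihoods, and all names (`random_hmm`, `forward`, `backward`, `likelihoods`) are hypothetical. The gradient and update steps are omitted because they depend on equations not reproduced here.

```python
import numpy as np

def random_hmm(n_states, n_symbols, rng):
    """Step (1): random discrete HMM (pi, A, B); each row sums to 1."""
    pi = rng.random(n_states)
    pi /= pi.sum()
    A = rng.random((n_states, n_states))
    A /= A.sum(axis=1, keepdims=True)
    B = rng.random((n_states, n_symbols))
    B /= B.sum(axis=1, keepdims=True)
    return pi, A, B

def forward(hmm, obs):
    """Forward probabilities alpha[t, i] (unscaled, short sequences only)."""
    pi, A, B = hmm
    alpha = np.zeros((len(obs), len(pi)))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

def backward(hmm, obs):
    """Backward probabilities beta[t, i] (unscaled)."""
    pi, A, B = hmm
    beta = np.ones((len(obs), len(pi)))
    for t in range(len(obs) - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

def likelihoods(hmms, obs, label):
    """Clamped likelihood (correct-class model only) and free likelihood
    (assumed here to be the sum over all models)."""
    per_model = np.array([forward(h, obs)[-1].sum() for h in hmms])
    return per_model[label], per_model.sum()

rng = np.random.default_rng(0)
hmms = [random_hmm(3, 4, rng) for _ in range(2)]  # one HMM per class
obs = [0, 2, 1, 3, 3]                             # symbol indices of one sequence
label = 1                                         # class of this sequence
clamped, free = likelihoods(hmms, obs, label)
print(clamped, free)
```

For long observation sequences the unscaled recursions underflow, so a practical version would use scaling or log-domain arithmetic.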

This procedure can easily be modified for continuous density HMMs by propagating the gradients, via the chain rule, to the parameters of the continuous probability distributions. It is also worth mentioning that the preprocessors can be trained simultaneously by back-propagating the gradients one stage further.
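As a sketch of the chain-rule step, assume (the notation here is illustrative and not taken from the original) that state j emits with a single Gaussian density b_j(o) with mean \mu_j and covariance \Sigma_j, and let L denote whichever likelihood is being differentiated. The gradient with respect to the mean is then obtained from the gradient with respect to the emission probabilities:

```latex
\frac{\partial L}{\partial \mu_j}
  = \sum_{t=1}^{T} \frac{\partial L}{\partial b_j(o_t)}\,
    \frac{\partial b_j(o_t)}{\partial \mu_j},
\qquad
\frac{\partial b_j(o_t)}{\partial \mu_j}
  = b_j(o_t)\,\Sigma_j^{-1}\,(o_t - \mu_j).
```

The same pattern would apply to covariances and mixture weights, and one further application of the chain rule through o_t would reach the preprocessor parameters.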
