To describe the MMI training procedure, we need the appropriate forms of eqns. 1.39 and 1.40.

First consider the *clamped* case. Since we are working at the sentence level, we form an HMM (by interconnecting the HMMs of the speech units) which represents a class *l* of sentences to which the current observation sequence belongs. Then, starting from eqn. 1.39,

where the second line follows from eqn. 1.3.
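The interconnection of unit HMMs into a sentence model can be sketched as a block-diagonal concatenation of transition matrices with links between successive units. This is a minimal left-to-right scheme assuming discrete-output HMMs; the function name, `link_prob`, and the exact linking rule are illustrative assumptions, not necessarily the construction used here:

```python
import numpy as np

def concat_hmms(units, link_prob=0.5):
    """Join left-to-right unit HMMs into one sentence-level HMM.

    units: list of (A, B) pairs for the speech units, in sentence order.
    The exit state of each unit is linked to the entry state of the next
    with probability link_prob (an illustrative linking scheme).
    """
    n = sum(A.shape[0] for A, _ in units)
    A_sent = np.zeros((n, n))
    B_sent = np.vstack([B for _, B in units])   # stack unit emissions
    off = 0
    for k, (A, _) in enumerate(units):
        m = A.shape[0]
        A_sent[off:off + m, off:off + m] = A    # unit's own transitions
        if k + 1 < len(units):                  # link exit -> next entry
            A_sent[off + m - 1] *= (1.0 - link_prob)
            A_sent[off + m - 1, off + m] = link_prob
        off += m
    return A_sent, B_sent

# Two identical 2-state units joined into a 4-state sentence model.
A_u = np.array([[0.8, 0.2], [0.0, 1.0]])
B_u = np.array([[0.7, 0.3], [0.4, 0.6]])
A_s, B_s = concat_hmms([(A_u, B_u), (A_u, B_u)])
```

Note that every row of the composed transition matrix remains stochastic: the exit row is rescaled by `1 - link_prob` before the link is added.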

For the *free* case, we have only one HMM, which represents the whole language. Therefore, summing over all the alternative classes is equivalent to summing over the whole set of states. Thus eqn. 1.40 takes the following form,
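A minimal sketch of how the two likelihoods, and the per-sentence mutual-information objective they define, are computed with the forward recursion, assuming discrete-output HMMs; the matrices, the observation sequence, and all variable names below are toy values for illustration:

```python
import numpy as np

def forward_likelihood(A, B, pi, obs):
    """P(O | model) by the forward recursion, summing over all final states."""
    alpha = pi * B[:, obs[0]]          # initialization
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # induction step
    return alpha.sum()                 # termination: sum over the whole state set

# Toy discrete-output models (illustrative numbers only).
A_clamped = np.array([[0.7, 0.3], [0.4, 0.6]])  # sentence (clamped) model
A_free    = np.array([[0.5, 0.5], [0.5, 0.5]])  # language (free) model
B_shared  = np.array([[0.9, 0.1], [0.2, 0.8]])  # shared unit emissions
pi        = np.array([0.6, 0.4])
obs       = [0, 1, 0]

p_clamped = forward_likelihood(A_clamped, B_shared, pi, obs)
p_free    = forward_likelihood(A_free, B_shared, pi, obs)
mmi       = np.log(p_clamped) - np.log(p_free)  # per-sentence MMI objective
```

Here the clamped and free models share the emission matrix, since the speech-unit HMMs are shared; only the transition structure differs.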

The MMI training procedure is then as follows.

**(1)**- Initialize each HMM
with values generated randomly or using an initialization algorithm.
**(2)**- Take the observation sequence of a sentence and,
- Form the corresponding sentence model using the HMMs of the speech units contained in the sentence.
- Calculate the forward and backward probabilities for the sentence model, using the recursions 1.5 and 1.2.
- Using equations 1.56 and 1.57, calculate the likelihood of the observations in the sentence model and in the language model.
- Using equations 1.44 and 1.48, calculate the gradients with respect to the parameters in all the speech-unit models; the quantity in those equations is replaced by a new quantity.
- Update the parameters of each of the HMMs in the language model using eqn. 1.19.

**(3)**- Go to step (2), unless all the observation sequences have been considered.
**(4)**- Repeat steps (2) to (3) until a convergence criterion is satisfied.
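The steps above can be sketched as a training loop. As an illustrative stand-in for the analytic gradients of eqns. 1.44 and 1.48, finite differences are used below, and only the emission probabilities are updated; all names, toy parameters, and the renormalization step are assumptions of this sketch, not the exact procedure:

```python
import numpy as np

def forward_likelihood(A, B, pi, obs):
    """P(O | model) via the forward recursion."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

def mmi_objective(B, A_sent, A_lang, pi, obs):
    """log P(O | sentence model) - log P(O | language model)."""
    return (np.log(forward_likelihood(A_sent, B, pi, obs))
            - np.log(forward_likelihood(A_lang, B, pi, obs)))

def train_mmi(A_sent, A_lang, pi, data, lr=0.05, epochs=20, eps=1e-5):
    rng = np.random.default_rng(0)
    B = rng.dirichlet(np.ones(2), size=2)           # step (1): random init
    for _ in range(epochs):                         # step (4): repeat to convergence
        for obs in data:                            # steps (2)-(3): every sentence
            base = mmi_objective(B, A_sent, A_lang, pi, obs)
            grad = np.zeros_like(B)
            for i in range(B.shape[0]):             # finite differences stand in
                for j in range(B.shape[1]):         # for the analytic gradients
                    Bp = B.copy()
                    Bp[i, j] += eps
                    grad[i, j] = (mmi_objective(Bp, A_sent, A_lang, pi, obs)
                                  - base) / eps
            B = np.clip(B + lr * grad, 1e-6, None)  # gradient-ascent update
            B /= B.sum(axis=1, keepdims=True)       # keep rows stochastic
    return B

# Toy setup: a structured sentence model against a flat language model.
A_sent = np.array([[0.7, 0.3], [0.1, 0.9]])
A_lang = np.full((2, 2), 0.5)
pi     = np.array([0.5, 0.5])
B_new  = train_mmi(A_sent, A_lang, pi, data=[[0, 1, 1], [0, 0, 1]])
```

In the actual procedure the gradients come from the forward-backward quantities, so no extra likelihood evaluations are needed; the finite-difference loop here merely keeps the sketch short.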
