If gradient-based techniques are used with the maximum likelihood (ML) criterion, the training procedure is as follows.

**(1)**- Initialize each HMM with values generated randomly or by an initialization algorithm such as *segmental K-means*.

**(2)**- Take an observation sequence of a sentence and,
- Form the corresponding sentence model using the HMMs of the speech units contained in the sentence.
- Calculate the forward and backward probabilities for the sentence model, using the recursions 1.5 and 1.2.
- Using equation 1.21, calculate the likelihood of the observations given the sentence model.
- Using equations 1.26 and 1.29, calculate the gradients with respect to all parameters in the sentence model.
- Update the parameters of each HMM in the sentence model using equation 1.19.

**(3)**- Go back to step (2) until all the observation sequences have been considered.

**(4)**- Repeat steps (2) to (3) until a convergence criterion is satisfied.
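
The steps above can be sketched in code. The following is only a minimal illustration under several assumptions that do not come from the text: observations are discrete symbols, the sentence model has already been formed and is treated as a single HMM, the initial state distribution `pi` is kept fixed, the recursions are unscaled (so only short sequences are safe), and the rows of the transition and emission matrices are kept stochastic by clipping and renormalising after each gradient step. All function names (`forward_backward`, `gradients`, `step`, `train`, `random_rows`) are hypothetical, and the cost minimised is the negative log-likelihood.

```python
import math
import random

def forward_backward(pi, A, B, obs):
    """Unscaled forward and backward passes (adequate for short sequences only)."""
    T, N = len(obs), len(pi)
    # pred[t][j] is alpha[t][j] before multiplying by the emission term B[j][obs[t]]
    pred = [list(pi)] + [[0.0] * N for _ in range(T - 1)]
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, T):
        pred[t] = [sum(alpha[t - 1][i] * A[i][j] for i in range(N)) for j in range(N)]
        alpha.append([pred[t][j] * B[j][obs[t]] for j in range(N)])
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
                   for i in range(N)]
    return alpha, beta, pred, sum(alpha[-1])

def gradients(pi, A, B, obs):
    """Gradients of J = -log(likelihood) with respect to the entries of A and B."""
    alpha, beta, pred, lik = forward_backward(pi, A, B, obs)
    N, M = len(A), len(B[0])
    gA = [[0.0] * N for _ in range(N)]
    gB = [[0.0] * M for _ in range(N)]
    for t, o in enumerate(obs):
        for j in range(N):
            gB[j][o] -= pred[t][j] * beta[t][j] / lik
            if t > 0:
                for i in range(N):
                    gA[i][j] -= alpha[t - 1][i] * B[j][o] * beta[t][j] / lik
    return gA, gB, lik

def step(P, G, lr, floor=1e-6):
    """Gradient step, then clip and renormalise each row (assumed constraint handling)."""
    out = []
    for row, grow in zip(P, G):
        new = [max(p - lr * g, floor) for p, g in zip(row, grow)]
        s = sum(new)
        out.append([v / s for v in new])
    return out

def train(pi, A, B, sequences, lr=0.01, max_epochs=50, tol=1e-6):
    prev = math.inf
    for _ in range(max_epochs):                # step (4): repeat until convergence
        cost = 0.0
        for obs in sequences:                  # steps (2)-(3): pass over all sequences
            gA, gB, lik = gradients(pi, A, B, obs)
            cost -= math.log(lik)
            A, B = step(A, gA, lr), step(B, gB, lr)
        if abs(prev - cost) < tol:
            break
        prev = cost
    return A, B

def random_rows(n_rows, n_cols, rng):
    """Step (1): random row-stochastic initialisation."""
    rows = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(n_rows)]
    return [[v / sum(r) for v in r] for r in rows]

# Toy usage: 3 states, 4 symbols, two short made-up observation sequences.
rng = random.Random(0)
pi = [1 / 3] * 3
A0, B0 = random_rows(3, 3, rng), random_rows(3, 4, rng)
A1, B1 = train(pi, A0, B0, [[0, 1, 3, 2, 1], [2, 2, 0, 1]])
```

The parameters are updated after every observation sequence, mirroring steps (2) and (3); the convergence test in `train` compares the accumulated cost between successive passes, which is one possible choice of criterion among many.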

Fri May 10 20:35:10 MET DST 1996