gradient wrt observation probabilities

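Using the chain rule,

$$\frac{\partial L_{tot}}{\partial b_j(o_t)} = \frac{\partial L_{tot}}{\partial \alpha_t(j)} \, \frac{\partial \alpha_t(j)}{\partial b_j(o_t)} \tag{1.27}$$

Differentiating (a time shifted version of) eqn.1.2 wrt $b_j(o_t)$,

$$\frac{\partial \alpha_t(j)}{\partial b_j(o_t)} = \frac{\alpha_t(j)}{b_j(o_t)} \tag{1.28}$$

Substituting eqns.1.28 and 1.24 in eqn.1.27 gives $\partial L_{tot} / \partial b_j(o_t)$. Substituting this in eqn.1.22 (keeping in mind that $\Theta = b_j(o_t)$ in this case), we finally get the required gradient,

$$\frac{\partial J}{\partial b_j(o_t)} = -\frac{1}{L_{tot}} \, \frac{\alpha_t(j) \, \beta_t(j)}{b_j(o_t)} \tag{1.29}$$

Usually this is given in the following form, by first substituting for $L_{tot}$ from eqn.1.21 and then substituting from eqn.1.14,

$$\frac{\partial J}{\partial b_j(o_t)} = -\frac{\alpha_t(j) \, \beta_t(j)}{\sum_{i=1}^{N} \alpha_t(i) \, \beta_t(i)} \, \frac{1}{b_j(o_t)} = -\frac{\gamma_t(j)}{b_j(o_t)} \tag{1.30}$$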
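As a numerical check of this result, the following is a minimal sketch for a discrete HMM in plain Python. It assumes the usual forward and backward variables, treats the entries of the observation matrix as free variables (no sum-to-one constraint), and, as an extension of the per-time-step result, sums -gamma_t(j)/b_j(o_t) over all t at which a given symbol k was observed to obtain the gradient wrt b_j(k). The function names, the toy parameter values and the finite-difference comparison are illustrative assumptions.

```python
import math

def forward(pi, A, B, obs):
    """Forward variables alpha[t][j] (time index starts at 0 here)."""
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, len(obs)):
        prev = alpha[-1]
        alpha.append([B[j][obs[t]] * sum(prev[i] * A[i][j] for i in range(N))
                      for j in range(N)])
    return alpha

def backward(A, B, obs):
    """Backward variables beta[t][i], with beta[T-1][i] = 1."""
    N, T = len(A), len(obs)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N))
                   for i in range(N)]
    return beta

def cost(pi, A, B, obs):
    """J = -log(L_tot), with L_tot the total likelihood of obs."""
    return -math.log(sum(forward(pi, A, B, obs)[-1]))

def grad_wrt_observation_probs(pi, A, B, obs):
    """dJ/db_j(k): sum of -gamma_t(j) / b_j(o_t) over all t with o_t == k."""
    N, K = len(B), len(B[0])
    alpha = forward(pi, A, B, obs)
    beta = backward(A, B, obs)
    L_tot = sum(alpha[-1])
    grad = [[0.0] * K for _ in range(N)]
    for t, k in enumerate(obs):
        for j in range(N):
            gamma = alpha[t][j] * beta[t][j] / L_tot
            grad[j][k] -= gamma / B[j][k]  # requires B[j][k] > 0
    return grad

def numerical_grad(pi, A, B, obs, eps=1e-6):
    """Central finite differences on each entry of B, used only as a check."""
    grad = [[0.0] * len(B[0]) for _ in B]
    for j in range(len(B)):
        for k in range(len(B[0])):
            plus = [row[:] for row in B]
            minus = [row[:] for row in B]
            plus[j][k] += eps
            minus[j][k] -= eps
            grad[j][k] = (cost(pi, A, plus, obs) - cost(pi, A, minus, obs)) / (2 * eps)
    return grad

# Hypothetical toy model: 2 states, 3 observation symbols.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2, 1]

analytic = grad_wrt_observation_probs(pi, A, B, obs)
numeric = numerical_grad(pi, A, B, obs)
max_diff = max(abs(analytic[j][k] - numeric[j][k])
               for j in range(len(B)) for k in range(len(B[0])))
print("largest analytic/numeric difference:", max_diff)
```

Because the gammas at each time step sum to one over the states, the sum of b_j(k) times the gradient entry over all j and k should equal minus the sequence length, which gives a second quick sanity check.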

If continuous densities are used, then the gradients of $J$ wrt the density parameters can be found by further propagating the derivative $\partial J / \partial b_j(o_t)$ using the chain rule.
The same method can be used to propagate the derivative (if necessary) to a front end processor of the HMM. This will be discussed in detail later.
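To illustrate what propagating the derivative one step further can look like, here is a minimal sketch in plain Python. It assumes, purely for illustration, that each state has a single univariate Gaussian density with mean mu_j and variance var_j, so that the derivative of b_j(o_t) wrt mu_j is b_j(o_t) (o_t - mu_j) / var_j. The density form, the function names and the toy values are assumptions made here; a mixture of multivariate Gaussians would follow the same pattern with more terms.

```python
import math

def gaussian(x, mu, var):
    """Univariate Gaussian density, the assumed form of b_j(o_t)."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def emissions(obs, mus, variances):
    """b[t][j] = density of observation obs[t] under state j."""
    return [[gaussian(o, m, v) for m, v in zip(mus, variances)] for o in obs]

def forward_backward(pi, A, b):
    """Return alpha, beta and the total likelihood L_tot."""
    T, N = len(b), len(pi)
    alpha = [[pi[i] * b[0][i] for i in range(N)]]
    for t in range(1, T):
        alpha.append([b[t][j] * sum(alpha[t - 1][i] * A[i][j] for i in range(N))
                      for j in range(N)])
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * b[t + 1][j] * beta[t + 1][j] for j in range(N))
                   for i in range(N)]
    return alpha, beta, sum(alpha[-1])

def cost(pi, A, obs, mus, variances):
    """J = -log(L_tot)."""
    return -math.log(forward_backward(pi, A, emissions(obs, mus, variances))[2])

def grad_wrt_means(pi, A, obs, mus, variances):
    """Chain rule: dJ/dmu_j = sum_t dJ/db_j(o_t) * db_j(o_t)/dmu_j."""
    b = emissions(obs, mus, variances)
    alpha, beta, L_tot = forward_backward(pi, A, b)
    grad = [0.0] * len(mus)
    for t, o in enumerate(obs):
        for j in range(len(mus)):
            gamma = alpha[t][j] * beta[t][j] / L_tot
            dJ_db = -gamma / b[t][j]                      # gradient wrt b_j(o_t)
            db_dmu = b[t][j] * (o - mus[j]) / variances[j]  # Gaussian derivative
            grad[j] += dJ_db * db_dmu
    return grad

def numerical_grad_means(pi, A, obs, mus, variances, eps=1e-5):
    """Central finite differences on each mean, used only as a check."""
    grad = []
    for j in range(len(mus)):
        plus, minus = mus[:], mus[:]
        plus[j] += eps
        minus[j] -= eps
        grad.append((cost(pi, A, obs, plus, variances)
                     - cost(pi, A, obs, minus, variances)) / (2 * eps))
    return grad

# Hypothetical toy model: 2 states, scalar observations.
pi = [0.5, 0.5]
A = [[0.8, 0.2], [0.3, 0.7]]
mus = [0.0, 2.0]
variances = [1.0, 0.5]
obs = [0.1, 1.9, 2.2, -0.3]

print("analytic:", grad_wrt_means(pi, A, obs, mus, variances))
print("numeric: ", numerical_grad_means(pi, A, obs, mus, variances))
```

Note that b_j(o_t) cancels in the product, leaving -gamma_t(j) (o_t - mu_j) / var_j per time step; the two factors are kept separate in the code only to make the chain rule visible.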

Narada Warakagoda
Fri May 10 20:35:10 MET DST 1996
