
gradient wrt observation probabilities

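Using the chain rule,

\[
\frac{\partial L_{tot}}{\partial b_j(y_t)} = \frac{\partial L_{tot}}{\partial \alpha_t(j)} \, \frac{\partial \alpha_t(j)}{\partial b_j(y_t)} \tag{1.27}
\]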

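Differentiating (a time-shifted version of) eqn. 1.2 wrt $b_j(y_t)$, and noting that $\alpha_t(j)$ is linear in $b_j(y_t)$, gives

\[
\frac{\partial \alpha_t(j)}{\partial b_j(y_t)} = \frac{\alpha_t(j)}{b_j(y_t)} \tag{1.28}
\]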

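Finally we get the required gradient by substituting for $\partial L_{tot} / \partial b_j(y_t)$ in eqn. 1.22 (keeping in mind that $\Theta = b_j(y_t)$ in this case). This derivative is itself obtained by substituting eqns. 1.28 and 1.24 in eqn. 1.27.

\[
\frac{\partial J}{\partial b_j(y_t)} = -\frac{1}{L_{tot}} \, \frac{\alpha_t(j)\,\beta_t(j)}{b_j(y_t)} \tag{1.29}
\]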

Usually this is written in the following form, obtained by first substituting for $L_{tot}$ from eqn. 1.21 and then substituting from eqn. 1.14.

\[
\frac{\partial J}{\partial b_j(y_t)} = -\frac{\gamma_t(j)}{b_j(y_t)} \tag{1.30}
\]
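
The result of this derivation can be checked numerically. The following is a minimal NumPy sketch, not code from the original work. It assumes the criterion is $J = -\ln L_{tot}$, so that the gradient wrt $b_j(y_t)$ is $-\gamma_t(j)/b_j(y_t)$. It treats every $b_j(y_t)$ as an independent variable stored in a hypothetical array `B[t, j]`, uses made-up HMM parameters `pi` and `A`, and runs unscaled forward and backward passes (adequate only for short sequences). The analytic gradient is compared with a central finite difference of $J$.

```python
import numpy as np

def forward_backward(pi, A, B):
    """Unscaled forward-backward pass; B[t, j] holds b_j(y_t)."""
    T, N = B.shape
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    alpha[0] = pi * B[0]
    for t in range(1, T):
        # alpha_t(j) = b_j(y_t) * sum_i alpha_{t-1}(i) a_ij
        alpha[t] = B[t] * (alpha[t - 1] @ A)
    for t in range(T - 2, -1, -1):
        # beta_t(i) = sum_j a_ij b_j(y_{t+1}) beta_{t+1}(j)
        beta[t] = A @ (B[t + 1] * beta[t + 1])
    L_tot = alpha[-1].sum()
    return alpha, beta, L_tot

def criterion(pi, A, B):
    """J = -ln L_tot (assumed form of the criterion)."""
    return -np.log(forward_backward(pi, A, B)[2])

def grad_J_wrt_b(pi, A, B):
    """Analytic gradient: -gamma_t(j) / b_j(y_t) for every t and j."""
    alpha, beta, L_tot = forward_backward(pi, A, B)
    gamma = alpha * beta / L_tot
    return -gamma / B

# Toy model with arbitrary (illustrative) parameters.
rng = np.random.default_rng(0)
N, T = 3, 5
pi = rng.dirichlet(np.ones(N))
A = rng.dirichlet(np.ones(N), size=N)      # each row sums to 1
B = rng.uniform(0.1, 1.0, size=(T, N))     # stand-in values for b_j(y_t)

analytic = grad_J_wrt_b(pi, A, B)

# Central finite differences, perturbing one b_j(y_t) at a time.
eps = 1e-6
numeric = np.zeros_like(B)
for t in range(T):
    for j in range(N):
        B_plus, B_minus = B.copy(), B.copy()
        B_plus[t, j] += eps
        B_minus[t, j] -= eps
        numeric[t, j] = (criterion(pi, A, B_plus) - criterion(pi, A, B_minus)) / (2 * eps)

max_abs_error = np.max(np.abs(analytic - numeric))
print("largest deviation from finite differences:", max_abs_error)
```

For a discrete HMM, where the same observation probability $b_j(k)$ is shared by all frames with $y_t = k$, the gradient wrt $b_j(k)$ would be the sum of these per-frame terms over those frames.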

If continuous densities are used, then the gradient of $J$ wrt the parameters of those densities can be found by further propagating the derivative $\partial J / \partial b_j(y_t)$ using the chain rule.
The same method can be used to propagate the derivative (if necessary) to a front-end processor of the HMM. This will be discussed in detail later.
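
As an illustration of this propagation step, the sketch below pushes $\partial J / \partial b_j(y_t)$ one stage further to the mean of a state density. Everything specific here is an assumption made for the example and is not taken from the original text: scalar observations, a single Gaussian per state with mean `mu[j]` and variance `var[j]`, the criterion $J = -\ln L_{tot}$, and the toy parameter values. Under those assumptions $\partial b_j(y_t)/\partial \mu_j = b_j(y_t)\,(y_t - \mu_j)/\sigma_j^2$, and the chain rule sums the product of the two derivatives over time.

```python
import numpy as np

def gaussian_likelihoods(y, mu, var):
    """Return B with B[t, j] = b_j(y_t) for scalar observations y[t]."""
    diff = y[:, None] - mu
    return np.exp(-0.5 * diff ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def criterion_and_gamma(pi, A, B):
    """Unscaled forward-backward; returns J = -ln L_tot and gamma[t, j]."""
    T, N = B.shape
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    alpha[0] = pi * B[0]
    for t in range(1, T):
        alpha[t] = B[t] * (alpha[t - 1] @ A)
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[t + 1] * beta[t + 1])
    L_tot = alpha[-1].sum()
    return -np.log(L_tot), alpha * beta / L_tot

def grad_J_wrt_mu(pi, A, y, mu, var):
    """Chain rule: dJ/dmu_j = sum_t dJ/db_j(y_t) * db_j(y_t)/dmu_j."""
    B = gaussian_likelihoods(y, mu, var)
    _, gamma = criterion_and_gamma(pi, A, B)
    dJ_dB = -gamma / B                       # gradient wrt b_j(y_t)
    dB_dmu = B * (y[:, None] - mu) / var     # derivative of the Gaussian wrt its mean
    return (dJ_dB * dB_dmu).sum(axis=0)

# Toy model with arbitrary (illustrative) parameters.
rng = np.random.default_rng(1)
pi = np.array([0.5, 0.3, 0.2])
A = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.3, 0.5]])
mu = np.array([-1.0, 0.0, 1.5])
var = np.array([1.0, 0.5, 2.0])
y = rng.normal(size=6)

analytic = grad_J_wrt_mu(pi, A, y, mu, var)

# Central finite differences on each mean as a sanity check.
eps = 1e-6
numeric = np.zeros_like(mu)
for j in range(len(mu)):
    step = np.zeros_like(mu)
    step[j] = eps
    J_plus, _ = criterion_and_gamma(pi, A, gaussian_likelihoods(y, mu + step, var))
    J_minus, _ = criterion_and_gamma(pi, A, gaussian_likelihoods(y, mu - step, var))
    numeric[j] = (J_plus - J_minus) / (2 * eps)

max_abs_error = np.max(np.abs(analytic - numeric))
print("largest deviation from finite differences:", max_abs_error)
```

Propagating to a front-end processor would follow the same pattern, with one more chain-rule factor for the dependence of the observations on the front-end parameters.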



Narada Warakagoda
Fri May 10 20:35:10 MET DST 1996
