
Gradient-based method

In the gradient-based method, any parameter $\theta$ of the HMM $\lambda$ is updated according to the standard gradient-descent formula with learning rate $\eta$,

\[ \theta_{\mathrm{new}} = \theta_{\mathrm{old}} - \eta \left. \frac{\partial J}{\partial \theta} \right|_{\theta = \theta_{\mathrm{old}}} \qquad (1.19) \]
where $J$ is a quantity to be minimized. We define in this case

\[ J = -\log P(O_1^T \mid \lambda) . \]
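The update rule of eqn. 1.19 can be illustrated on a toy objective before applying it to the HMM likelihood; in the sketch below the quadratic $J(\theta) = (\theta - 2)^2$ and the learning-rate value are illustrative stand-ins, not from the text:

```python
# Minimal sketch of the gradient-descent update (eqn 1.19) on a toy
# objective J(theta) = (theta - 2)^2; in the HMM case dJ/dtheta would
# instead come from eqns 1.21-1.22.
eta = 0.1                       # learning rate (illustrative value)
theta = 0.0                     # initial parameter value
for _ in range(100):
    grad = 2.0 * (theta - 2.0)  # dJ/dtheta for the toy objective
    theta = theta - eta * grad  # eqn 1.19
print(round(theta, 4))          # converges to the minimizer, 2.0
```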
Since the minimization of $J$ is equivalent to the maximization of $P(O_1^T \mid \lambda)$, eqn. 1.19 yields the required optimization criterion, ML. The problem is then to find the derivative $\partial J / \partial \theta$ for any parameter $\theta$ of the model. This can be done by relating $J$ to the model parameters via $P(O_1^T \mid \lambda)$. As a key step, using eqns. 1.7 and 1.9 we obtain, for any $t$,

\[ P(O_1^T \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i) = \sum_{i=1}^{N} \alpha_t(i)\, \beta_t(i) . \qquad (1.20) \]
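Eqn. 1.20 can be checked numerically. The following sketch computes the forward and backward variables for a small discrete-output HMM and verifies that $\sum_i \alpha_t(i)\,\beta_t(i)$ gives the same value for every $t$; the 2-state, 3-symbol model values are hypothetical, not from the text:

```python
import numpy as np

# Hypothetical 2-state, 3-symbol HMM (illustrative values).
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])        # transition probabilities a_ij
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])   # observation probabilities b_j(o)
pi = np.array([0.6, 0.4])         # initial state distribution
O = [0, 2, 1, 0]                  # observation sequence o_1 .. o_T
T, N = len(O), A.shape[0]

# Forward pass: alpha_t(i) = P(o_1..o_t, q_t = i | lambda)
alpha = np.zeros((T, N))
alpha[0] = pi * B[:, O[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]

# Backward pass: beta_t(i) = P(o_{t+1}..o_T | q_t = i, lambda)
beta = np.ones((T, N))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])

# Eqn 1.20: P(O | lambda) = sum_i alpha_T(i) = sum_i alpha_t(i) beta_t(i)
P = alpha[-1].sum()
for t in range(T):
    assert np.isclose((alpha[t] * beta[t]).sum(), P)
print(P)
```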
Differentiating the last equality in eqn. 1.20 with respect to an arbitrary parameter $\theta$ gives

\[ \frac{\partial P(O_1^T \mid \lambda)}{\partial \theta} = \sum_{i=1}^{N} \left[ \frac{\partial \alpha_t(i)}{\partial \theta}\, \beta_t(i) + \alpha_t(i)\, \frac{\partial \beta_t(i)}{\partial \theta} \right] , \qquad (1.21) \]

and differentiating the definition of $J$ gives

\[ \frac{\partial J}{\partial \theta} = - \frac{1}{P(O_1^T \mid \lambda)}\, \frac{\partial P(O_1^T \mid \lambda)}{\partial \theta} . \qquad (1.22) \]
Eqn. 1.22 gives $\partial J / \partial \theta$ if we know $\partial P(O_1^T \mid \lambda) / \partial \theta$, which can be found using eqn. 1.21. However, this derivative is specific to the actual parameter concerned. Since there are two main parameter sets in the HMM, namely the transition probabilities $a_{ij}$ and the observation probabilities $b_j(o_t)$, we can find the derivative $\partial P(O_1^T \mid \lambda) / \partial \theta$ for each of the parameter sets and hence the gradient $\partial J / \partial \theta$.
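As a sketch of how eqns. 1.21 and 1.22 are used in practice, the following computes $\partial P / \partial a_{ij}$ for one transition probability as $\sum_{t=1}^{T-1} \alpha_t(i)\, b_j(o_{t+1})\, \beta_{t+1}(j)$ (the standard unconstrained derivative, ignoring the row-sum constraint on $A$; the per-parameter form is derived in the following sections), applies eqn. 1.22, and checks the result against a finite-difference approximation. The model values are hypothetical:

```python
import numpy as np

# Hypothetical 2-state, 3-symbol HMM (illustrative values).
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
O = [0, 2, 1, 0]

def forward_backward(A, B, pi, O):
    """Return the forward (alpha) and backward (beta) variables."""
    T, N = len(O), A.shape[0]
    alpha = np.zeros((T, N))
    beta = np.ones((T, N))
    alpha[0] = pi * B[:, O[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])
    return alpha, beta

alpha, beta = forward_backward(A, B, pi, O)
P = alpha[-1].sum()                 # P(O | lambda)

# dP/da_ij = sum_{t=1}^{T-1} alpha_t(i) b_j(o_{t+1}) beta_{t+1}(j)
# (unconstrained derivative, ignoring the row-sum constraint on A)
i, j = 0, 1
dP = sum(alpha[t, i] * B[j, O[t + 1]] * beta[t + 1, j]
         for t in range(len(O) - 1))
dJ = -dP / P                        # eqn 1.22

# Finite-difference check of dJ/da_ij on J = -log P(O | lambda)
eps = 1e-6
A2 = A.copy()
A2[i, j] += eps
alpha2, _ = forward_backward(A2, B, pi, O)
dJ_fd = -(np.log(alpha2[-1].sum()) - np.log(P)) / eps
assert np.isclose(dJ, dJ_fd, rtol=1e-4)
print(dJ)
```

Increasing $a_{ij}$ can only increase the (unconstrained) likelihood here, so the gradient of $J$ is negative; a real trainer would additionally re-impose the stochastic constraints on $A$ after each update.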

Narada Warakagoda
Fri May 10 20:35:10 MET DST 1996
