In general, the learning problem is how to adjust the HMM parameters
so that the given set of observations (called the training set)
is represented by the model in the best way for the intended
application. It follows that the ``quantity'' we wish to
optimize during the learning process can differ from application
to application. In other words, there may be several optimization
criteria for learning, from which a suitable one is selected
depending on the application.
There are two main optimization criteria found in the ASR literature: Maximum Likelihood (ML) and Maximum Mutual Information (MMI). The solutions to the learning problem under each of these criteria are described below.
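
As a rough orientation, the two criteria can be sketched in symbols. The notation here is assumed for illustration and is not taken from the text: $O$ denotes the training observations, $\lambda$ the parameters of a single model, $\lambda_v$ the model of class $v$, $\Lambda$ the set of all class models, and $P(w)$ the prior probability of class $w$. Under ML, the parameters would be chosen as
\begin{equation}
\lambda^{*}_{\mathrm{ML}} = \arg\max_{\lambda} \; P(O \mid \lambda),
\end{equation}
whereas under MMI, for observations $O^{v}$ known to belong to class $v$, they would be chosen as
\begin{equation}
\Lambda^{*}_{\mathrm{MMI}} = \arg\max_{\Lambda} \left[ \log P(O^{v} \mid \lambda_{v}) - \log \sum_{w} P(O^{v} \mid \lambda_{w}) \, P(w) \right].
\end{equation}
In this sketch the ML objective involves only the model of the correct class, while the MMI objective also penalizes the likelihood that the competing models assign to the same observations.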