We have a model and a sequence of observations
, and must be found. We can calculate this quantity using
simple probabilistic arguments. But this calculation involves number of
operations in the order of . This is very large even if the
length of the sequence, T is moderate. Therefore we have to look
for an other method for this calculation. Fortunately there exists
one which has a considerably low complexity and makes use an
auxiliary variable, called forward variable.
The forward variable is defined as the probability of the partial observation sequence , when it terminates at the state i. Mathematically,
Then it is easy to see that following recursive relationship holds.
where,
Using this recursion we can calculate
and then the required probability is given by,
The complexity of this method, known as the forward algorithm is proportional to , which is
linear wrt T whereas the direct calculation mentioned earlier, had an
exponential complexity.
In a similar way we can define the backward variable as the probability of the partial observation sequence , given that the current state is i. Mathematically ,
As in the case of there is a recursive relationship which can be used to calculate efficiently.
where,
Further we can see that,
Therefore this gives another way to calculate , by using both forward and backward variables as given in eqn. 1.7.
Eqn. 1.7 is very useful, specially in deriving the formulas required for gradient based training.