and we are interested in the probability $P(w_1, w_2, \ldots, w_n)$ of the whole sequence. This can be approximated by
$$P(w_1, w_2, \ldots, w_n) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-N+1}, \ldots, w_{i-1}),$$
which is called an N-gram language model. These probabilities
can be estimated by counting occurrences in the training database. However, this is clearly impractical except for the case N = 2 and possibly N = 3. The case N = 2 is called a bigram language model, and it is the most widely used. An extreme case of the bigram model, called the word pair grammar, is obtained by quantizing the bigram probabilities to 1 or 0. At the other extreme, the most relaxed grammar is no grammar at all, in which any speech unit can be followed by any speech unit in the vocabulary.
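As an illustration of this counting, the following Python sketch (the toy corpus and the names `bigram_prob` and `word_pair` are invented for this example, not taken from the text) estimates bigram probabilities by relative frequency and then derives a word pair grammar by quantizing them to 1 or 0.

```python
from collections import Counter

# Toy training corpus; in practice this is the recognizer's text database.
corpus = "show me the score show me the time tell me the score".split()

# Count each word's appearances as a bigram history, and each adjacent pair.
history = Counter(corpus[:-1])              # C(w_{i-1})
pairs = Counter(zip(corpus, corpus[1:]))    # C(w_{i-1}, w_i)

def bigram_prob(prev, word):
    """Relative-frequency estimate P(word | prev) = C(prev, word) / C(prev)."""
    return pairs[(prev, word)] / history[prev] if history[prev] else 0.0

# Word pair grammar: quantize each bigram probability to 1 (the pair was
# observed, so it is allowed) or 0 (never observed, so it is forbidden).
word_pair = {pair: 1 for pair in pairs}

print(bigram_prob("me", "the"))            # 1.0: "me" is always followed by "the"
print(word_pair.get(("the", "time"), 0))   # 1: "the time" is an allowed pair
print(word_pair.get(("score", "me"), 0))   # 0: "score me" never occurs
```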
The difficulty of recognition varies with the language model used. To express this difficulty mathematically, we define a measure called the perplexity $B$,
$$B = P(w_1, w_2, \ldots, w_n)^{-1/n},$$
where $P(w_1, \ldots, w_n)$ is the probability the language model assigns to a test word sequence of length $n$; it can be interpreted as the average branching factor the recognizer faces at each word.
To completely specify the sophistication of a continuous speech recognizer, the perplexity of the assumed language model must therefore be reported in addition to the recognition rate.
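As a concrete illustration, the sketch below (the probability table and test sentence are hypothetical) computes the perplexity $B$ of a bigram model over a short word sequence, working in log space for numerical stability.

```python
import math

def perplexity(sentence, prob):
    """B = P(w_1, ..., w_n)^(-1/n), computed in log space.

    `prob(prev, word)` is assumed to return a strictly positive
    bigram probability P(word | prev) for every pair it is given.
    """
    n = len(sentence)
    # Probability of the first word given a start symbol, then each bigram.
    log_p = math.log(prob("<s>", sentence[0]))
    log_p += sum(math.log(prob(prev, word))
                 for prev, word in zip(sentence, sentence[1:]))
    return math.exp(-log_p / n)

# Hypothetical bigram probabilities for a tiny vocabulary.
table = {("<s>", "show"): 0.5, ("show", "me"): 0.8, ("me", "scores"): 0.3}
print(perplexity(["show", "me", "scores"], lambda p, w: table[(p, w)]))
# ≈ 2.03: on average the model faces about two equally likely choices per word.
```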