and we are interested in the probability of the word sequence, P(w_1, w_2, \ldots, w_Q). This can be approximated by

P(w_1, \ldots, w_Q) \approx \prod_{i=1}^{Q} P(w_i \mid w_{i-N+1}, \ldots, w_{i-1}),
which is called an *N-gram language model*. These probabilities
can be estimated by counting occurrences in the training database. However,
this is clearly impractical except for *N*=2 and possibly *N*=3. The case
*N*=2 is called a *bigram language model*, and it is the most widely used
model. An extreme case of the bigram model, called a *word pair grammar*,
is obtained by quantizing the bigram probabilities to 0 and 1. The most
relaxed grammar is *no grammar*, in which any speech unit can be followed
by any speech unit in the vocabulary.
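As a minimal sketch of the occurrence-counting idea above, the following estimates bigram probabilities as relative frequencies from a toy corpus (the corpus and the sentence-boundary markers `<s>`/`</s>` are illustrative assumptions, not from the original text):

```python
from collections import Counter

def train_bigram(sentences):
    """Estimate P(w_i | w_{i-1}) by counting occurrences in the corpus
    and normalizing by the count of the preceding unit."""
    unigram = Counter()
    bigram = Counter()
    for words in sentences:
        padded = ["<s>"] + words + ["</s>"]  # mark sentence boundaries
        for prev, cur in zip(padded, padded[1:]):
            unigram[prev] += 1
            bigram[(prev, cur)] += 1
    return {pair: count / unigram[pair[0]] for pair, count in bigram.items()}

# Hypothetical toy corpus: each sentence is a list of speech units.
corpus = [["we", "recognize", "speech"],
          ["we", "recognize", "words"]]
probs = train_bigram(corpus)
print(probs[("we", "recognize")])     # 1.0: "recognize" always follows "we"
print(probs[("recognize", "speech")]) # 0.5: "speech" follows "recognize" half the time
```

The same counting scheme applies for larger *N*, but the number of distinct *N*-grams to count grows so quickly with *N* that, as noted above, it is impractical beyond *N*=3.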

The difficulty of recognition varies with the language model used. To
express this quantity mathematically we define a measure called
*perplexity*, *B*, as

B = \hat{P}(w_1, w_2, \ldots, w_Q)^{-1/Q},

where \hat{P}(w_1, \ldots, w_Q) is the probability the language model
assigns to the word sequence and Q is its length.
To completely specify the sophistication of a continuous speech recognizer, the perplexity of the assumed language model must be reported in addition to the recognition rate and other figures of merit.
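The perplexity B = P(w_1, \ldots, w_Q)^{-1/Q} can be computed directly from a language model's conditional probabilities. Here is a hedged sketch assuming a bigram model stored as a dictionary of conditional probabilities (the probability values and the `<s>`/`</s>` boundary markers are hypothetical):

```python
import math

def perplexity(bigram_probs, words):
    """Perplexity B = P(w_1, ..., w_Q)^(-1/Q); summing logs avoids
    numerical underflow on long sequences."""
    padded = ["<s>"] + words + ["</s>"]
    log_p = sum(math.log(bigram_probs[(prev, cur)])
                for prev, cur in zip(padded, padded[1:]))
    q = len(padded) - 1  # number of predicted units, including </s>
    return math.exp(-log_p / q)

# Hypothetical bigram probabilities for a tiny vocabulary.
probs = {("<s>", "we"): 1.0, ("we", "recognize"): 1.0,
         ("recognize", "speech"): 0.5, ("speech", "</s>"): 1.0}
print(perplexity(probs, ["we", "recognize", "speech"]))  # 2**0.25 ~= 1.19
```

Intuitively, B behaves like the average branching factor the recognizer faces at each step: the "no grammar" case gives a perplexity equal to the vocabulary size, while a tight grammar gives a much smaller value.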

Fri May 10 20:35:10 MET DST 1996