[Corpora-List] Kneser-Ney smoothing, unseen context

Mon Jul 22 14:43:04 CEST 2013

Hello Angelina,

This question has been asked in this mailing list a couple of times. Here is a link to one of the previous discussions:

2013/7/22 Angelina Ivanova <angyjune at yandex.ru>

> Hello,
>
> I would like to ask a question about Kneser-Ney smoothing. I write
> formulas as if I were typing them in LaTeX.
>
>
>
> The formulas are on page 370 of the paper by Chen and Goodman (1999):
>
> http://u.cs.biu.ac.il/~yogo/courses/mt2013/papers/chen-goodman-99.pdf
>
>
>
> Suppose we are evaluating the probability p_{KN}(w_i|w_{i-n+1}^{i-1}).
>
>
>
> If we haven't seen the context w_{i-n+1}…w_{i-1} in the training set, the
> divisor is 0 in the first summand and in the \gamma parameter of the second
> summand.
>
>
>
> I mean the divisor \sum_{w_i} c(w_{i-n+1}^{i})
>
>
>
> How do we compute the smoothed probability in this case?
>
> a) Should it be zero? (But smoothing is supposed to help get rid of
> zeros...)
>
> b) Or should it be p_{KN}(w_i|w_{i-n+1}^{i-1}) = p_{KN}(w_i|w_{i-n+2}^{i-1})? But in
> this case we assume that \gamma is 1, which means we put all the weight on the
> lower-order n-grams…
>
> c) Should we choose some \lambda parameter (what should it be in this
> case? 0.1?) and set p_{KN}(w_i|w_{i-n+1}^{i-1}) = \lambda * p_{KN}(w_i|w_{i-n+2}^{i-1})?
>
> d) ???
>
>
>
> Thank you!
>
> Angelina
>
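
For readers of the archive, here is a minimal sketch of how option (b) is commonly handled. When the context has never been seen, the count-based term is undefined (0/0), so the whole probability mass goes to the lower-order distribution, which amounts to \gamma = 1 for that context. The Python below illustrates the bigram case only and is not code from Chen and Goodman. The function names (train_bigram_kn, p_continuation, p_kn), the single discount D = 0.75, and the toy corpus are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_bigram_kn(tokens, discount=0.75):
    """Collect the counts an interpolated Kneser-Ney bigram model needs."""
    bigrams = Counter(zip(tokens, tokens[1:]))   # c(v, w)
    context_total = Counter()                    # sum over w of c(v, w)
    followers = defaultdict(set)                 # distinct w seen after v
    preceders = defaultdict(set)                 # distinct v seen before w
    for (v, w), count in bigrams.items():
        context_total[v] += count
        followers[v].add(w)
        preceders[w].add(v)
    return {
        "discount": discount,
        "bigrams": bigrams,
        "context_total": context_total,
        "followers": followers,
        "preceders": preceders,
    }


def p_continuation(model, w):
    """Lower-order distribution: share of distinct bigram types ending in w."""
    return len(model["preceders"].get(w, ())) / len(model["bigrams"])


def p_kn(model, w, v):
    """Interpolated Kneser-Ney estimate of p(w | v) for a bigram model."""
    lower = p_continuation(model, w)
    total = model["context_total"].get(v, 0)
    if total == 0:
        # Unseen context: the count-based term is undefined (0/0), so all
        # weight goes to the lower-order model, i.e. gamma = 1 (assumption
        # of this sketch, matching option (b) in the question).
        return lower
    d = model["discount"]
    discounted = max(model["bigrams"].get((v, w), 0) - d, 0.0) / total
    gamma = d * len(model["followers"][v]) / total
    return discounted + gamma * lower


# Toy corpus (hypothetical); "dog" never occurs, so it is an unseen context.
tokens = "the cat sat on the mat the cat ate".split()
model = train_bigram_kn(tokens)
vocab = set(tokens)
```

With this setup, p_kn(model, w, "dog") equals p_continuation(model, w) for every w, and the probabilities over the vocabulary sum to 1 for both seen and unseen contexts. The sketch gives zero probability to words outside the training vocabulary. Handling those would need a further interpolation step, for example with a uniform distribution.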