This question has been asked on this mailing list a couple of times. Here is a link to one of the previous discussions:
http://mailman.uib.no/public/corpora/2012-June/015774.html
Kind regards, Wladimir
2013/7/22 Angelina Ivanova <angyjune at yandex.ru>
> Hello,
>
> I would like to ask a question about Kneser-Ney smoothing. I write the
> formulas as if I were typing them in LaTeX.
>
> The formulas are on page 370 of the paper by Chen and Goodman (1999):
>
> http://u.cs.biu.ac.il/~yogo/courses/mt2013/papers/chen-goodman-99.pdf
>
> Suppose we are evaluating the probability p_{KN}(w_i|w_{i-n+1}^{i-1}).
>
> If we haven't seen the context w_{i-n+1}…w_{i-1} in the training set, the
> divisor is 0 in the first summand and in the \gamma parameter of the second
> summand.
>
> I mean the divisor \sum_{w_i} c(w_{i-n+1}^{i}).
>
> How do we compute the smoothed probability in this case?
>
> a) Should it be zero? (But smoothing is supposed to help get rid of
> zeros...)
>
> b) Or should it be p_{KN}(w_i|w_{i-n+1}^{i-1}) = p_{KN}(w_i|w_{i-n+2}^{i-1})?
> But in this case we assume that \gamma is 1, which means we put all the
> weight on the lower-order n-grams...
>
> c) Should we choose some \lambda parameter (what should it be in this
> case? 0.1?) and set
> p_{KN}(w_i|w_{i-n+1}^{i-1}) = \lambda * p_{KN}(w_i|w_{i-n+2}^{i-1})?
>
> d) ???
>
>
>
> Thank you!
>
> Angelina
>
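
For illustration, option (b) can be sketched for the bigram case. This is a
minimal sketch, not a reference implementation. It assumes a fixed discount
D = 0.75, and it assumes that an unseen context puts all the weight on the
lower-order distribution (\gamma = 1), which is one common convention rather
than something stated in the quoted message. The names train_bigram_kn,
p_cont and p_kn are made up for this example. The lower-order distribution
used here is the continuation probability N_{1+}(. w) / N_{1+}(. .), which
sums to 1 over the seen vocabulary, so the backed-off result is still a
proper distribution.

```python
from collections import Counter, defaultdict


def train_bigram_kn(tokens, discount=0.75):
    """Return p_kn(w, v): interpolated Kneser-Ney estimate of p(w | v) for bigrams.

    Assumption (option b): if the context v was never seen, gamma is taken
    to be 1 and the lower-order continuation probability is returned as is.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))   # c(v w)
    context_total = Counter()                    # sum_w c(v w)
    followers = defaultdict(set)                 # distinct w after v: N_{1+}(v .)
    preceders = defaultdict(set)                 # distinct v before w: N_{1+}(. w)
    for (v, w), count in bigrams.items():
        context_total[v] += count
        followers[v].add(w)
        preceders[w].add(v)
    n_bigram_types = len(bigrams)                # N_{1+}(. .)

    def p_cont(w):
        # Lower-order (continuation) probability of w.
        return len(preceders.get(w, ())) / n_bigram_types

    def p_kn(w, v):
        total = context_total.get(v, 0)
        if total == 0:
            # Unseen context: the divisor would be 0, so back off completely.
            return p_cont(w)
        gamma = discount * len(followers[v]) / total
        discounted = max(bigrams.get((v, w), 0) - discount, 0) / total
        return discounted + gamma * p_cont(w)

    return p_kn


p_kn = train_bigram_kn("a b a c a b".split())
vocab = ["a", "b", "c"]
print(sum(p_kn(w, "a") for w in vocab))  # seen context
print(sum(p_kn(w, "z") for w in vocab))  # unseen context "z"
```

Words that never occur in the training data still get probability 0 in this
sketch; handling them would need one more back-off step, for example to a
uniform distribution over the vocabulary.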