[Corpora-List] Wonky ngrams

Stephen Grimes sgrimes
Fri Jan 4 13:33:26 CET 2013


Consider the case in which your entire corpus consists of a single sentence:

"In spite of my disdain for the title I really liked the book."

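The sentence has 13 words, so it contains 12 bigrams and 11 trigrams. Each frequency is normalized by the number of n-grams of its own order, so the trigram frequency of "In spite of" is 1/11, while the bigram frequency of "In spite" is somewhat lower, 1/12.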
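
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes plain whitespace tokenization with the final period dropped, and it assumes each count is divided by the total number of n-grams of the same order. The function name ngram_relative_freqs is my own, and this is only an illustration of the toy example, not a description of how Google actually builds its counts.

```python
from collections import Counter
from fractions import Fraction


def ngram_relative_freqs(tokens, n):
    """Return {ngram: count / total number of n-grams of order n}."""
    # A sequence of k tokens yields k - n + 1 n-grams of order n.
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = len(grams)
    return {gram: Fraction(count, total) for gram, count in counts.items()}


# Toy corpus from the message; the final period is dropped (assumption).
sentence = "In spite of my disdain for the title I really liked the book"
tokens = sentence.split()  # 13 tokens

bigram_freqs = ngram_relative_freqs(tokens, 2)   # 12 bigrams in total
trigram_freqs = ngram_relative_freqs(tokens, 3)  # 11 trigrams in total

print(bigram_freqs[("In", "spite")])          # 1/12
print(trigram_freqs[("In", "spite", "of")])   # 1/11
```

Both n-grams occur exactly once; only the denominators differ, which is why the longer n-gram ends up with the higher relative frequency.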

-Steve Grimes

On 1/4/2013 7:04 AM, Brett Reynolds wrote:
> Can anyone explain why "in spite of" would have a higher frequency than
> "in spite" in the following graph from Google ngrams?
> http://goo.gl/u7J3F
>
> -------------------------------------
>
> Brett Reynolds
> English Language Centre
> Humber Institute of Technology and Advanced Learning
> Lakeshore Campus
> Toronto, Ontario
> Phone: 416-675-6622 ex. 3106
>
> brett.reynolds at humber.ca <mailto:brett.reynolds at humber.ca>

--
Stephen Grimes, Ph.D.
Linguistic Data Consortium
http://www.ldc.upenn.edu/~sgrimes


