[Corpora-List] Query about nomenclature

Andrew Kehoe Andrew.Kehoe at uce.ac.uk
Fri Mar 11 17:41:00 CET 2005


You need to use the search term "ngram -perl" rather than "ngram not
perl" because, as Stefan Evert pointed out, "ngram not perl" just
returns pages containing all 3 of those words.
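
To illustrate the difference, here is a minimal sketch of a toy matcher. It is
not Google's actual implementation; the `matches` function and the sample page
texts are made up for illustration. It assumes only that plain terms must all
occur and that a term prefixed with "-" must not occur.

```python
def matches(query: str, page_text: str) -> bool:
    """Toy model: every plain term must occur; '-term' must not occur."""
    words = set(page_text.lower().split())
    for term in query.lower().split():
        if term.startswith("-") and len(term) > 1:
            # Exclusion operator: reject pages containing the word.
            if term[1:] in words:
                return False
        elif term not in words:
            # Plain term (including the literal word "not") is required.
            return False
    return True


# Hypothetical page texts, invented for this example.
pages = [
    "ngram statistics for corpus linguistics",
    "an ngram module written in perl",
    "this ngram tool is not a perl script",
]

print([matches("ngram not perl", p) for p in pages])  # [False, False, True]
print([matches("ngram -perl", p) for p in pages])     # [True, False, False]
```

In this toy model "ngram not perl" keeps only the page that happens to contain
all three words, whereas "ngram -perl" keeps the page that does not mention
Perl.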

Another problem with your method is that Google ignores hyphens in
search terms. One of the pages returned for the term "n-gram" is
http://cpan.dei.uc.pt/authors/id/J/JH/JHI/ngram.pl-1.48&e=8092 but this
page does not contain the word "n-gram" at all, only "ngram" without the
hyphen.
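
The effect on hit counts can be sketched as follows. This is a toy model under
the assumption that hyphens are simply dropped before matching; how Google
really handles them is not known here, and the function names and sample pages
are hypothetical.

```python
import re


def tokens(text: str) -> list[str]:
    """Split text into lowercase word tokens, keeping internal hyphens."""
    return re.findall(r"[a-z][a-z\-]*", text.lower())


def count_hits(term: str, pages: list[str], ignore_hyphens: bool) -> int:
    """Count pages containing the term, optionally ignoring hyphens."""
    def norm(word: str) -> str:
        return word.replace("-", "") if ignore_hyphens else word

    target = norm(term.lower())
    return sum(any(norm(t) == target for t in tokens(p)) for p in pages)


# Hypothetical page texts, invented for this example.
pages = [
    "ngram.pl computes ngram counts",
    "an n-gram language model",
    "ngram and n-gram are both used",
]

print(count_hits("n-gram", pages, ignore_hyphens=False))  # 2
print(count_hits("n-gram", pages, ignore_hyphens=True))   # 3
```

With hyphens ignored, the count for "n-gram" also includes the page that only
ever writes "ngram", so it cannot be compared directly with the count for
"ngram".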

Andrew Kehoe
Research and Development Unit for English Studies
School of English
University of Central England, Birmingham


-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of John F. Sowa
Sent: 10 March 2005 01:43
To: Damon Allen Davison
Cc: John Mckenny; CORPORA at HD.UIB.NO
Subject: Re: [Corpora-List] Query about nomenclature

Damon Davison's use of Google inspired me to try
a variation. I just typed three queries and
got the following number of hits:

Search string     Hits
--------------    ------
ngram             21,100
ngram not perl       540
n-gram            85,700

This seems to provide overwhelming evidence for
a hyphen between "n" and "gram". Since Google
doesn't distinguish capitals, that leaves the
capitalization question unresolved.

John Sowa
