[Corpora-List] word frequencies on the web

William Fletcher fletcher at usna.edu
Fri Dec 8 20:08:00 CET 2006


Dear Tony,

I have lists of words occurring 100 or more and 10 or more times
respectively in the preliminary version of a dynamic Web Corpus I am
compiling for "Phrases in English". Since you cannot reach PIE directly, I
put them on my KWiCFinder site:

http://www.kwicfinder.com/WebCorpus2006_min100.html

Tab-separated text files:
http://www.kwicfinder.com/WebCorpus2006_min100.txt
http://www.kwicfinder.com/WebCorpus2006_min10.txt

The corpus currently has 97,198,272 tokens and 525,509 types, of which 30,524
types occur 100 or more times and 104,675 types occur 10 or more times.
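
For anyone who wants to work with the tab-separated lists, here is a minimal
Python sketch of reading one and normalizing raw counts to occurrences per
million tokens. It rests on assumptions: the column order (word first, count
second) is a guess, so check the actual files and adjust word_col and
count_col; the function names are hypothetical; and the sample rows are
invented for illustration, not taken from the corpus. Only the token total
comes from the figures above.

```python
import csv
import io

# Token total quoted in this message for the preliminary corpus.
CORPUS_TOKENS = 97_198_272


def load_frequencies(lines, word_col=0, count_col=1):
    """Parse tab-separated rows into a {word: count} dict.

    The column positions are assumptions; rows that are too short or whose
    count field is not an integer (e.g. a header row) are skipped.
    """
    freqs = {}
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= max(word_col, count_col):
            continue
        try:
            count = int(row[count_col].replace(",", ""))
        except ValueError:
            continue  # header or malformed row
        word = row[word_col]
        freqs[word] = freqs.get(word, 0) + count
    return freqs


def per_million(count, corpus_tokens=CORPUS_TOKENS):
    """Convert a raw count to a frequency per million tokens."""
    return count * 1_000_000 / corpus_tokens


# Invented sample rows standing in for a downloaded file.
sample = io.StringIO("word\tfreq\nthe\t5000000\ncorpus\t1200\n")
freqs = load_frequencies(sample)
for word, count in freqs.items():
    print(word, count, round(per_million(count), 2))
```

To use it on a real list, pass an open file handle instead of the sample,
for example open("WebCorpus2006_min100.txt", encoding="utf-8"); the encoding
is also an assumption.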

Regards,
Bill Fletcher

-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of Tony Berber Sardinha
Sent: Friday, December 08, 2006 11:44 AM
To: CORPORA
Subject: [Corpora-List] word frequencies on the web

Dear all, does anyone know of ways to estimate the frequency of words on the
web, or if there're search engines that supply this info (as Altavista used
to do)?

thank you!
tony
www2.lael.pucsp.br/~tony







