[Corpora-List] problems with Google counts

Jim Breen Jim.Breen at infotech.monash.edu.au
Fri Mar 18 01:21:12 CET 2005

[This is a repeat attempt. I posted this 3 days ago, and (a) it never
appeared, and (b) I never received another copy of the corpora-digest,
although I see from the WWW page that there have been more postings. Did
I offend the server?]

Matthew Hurst <mhurst_AT_intelliseek.com> wrote:


>> As for Lillian's original post, I notice that Google's language classifier,
>> at least for Japanese, is not very good...

What sorts of problems are you encountering? I used to include
the hiragana "no" (の) in all Google requests to keep Chinese pages
out of the results, but these days the language setting seems to
achieve the same result.
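
For anyone who wants to try the trick, here is a minimal Python sketch
of the idea. It is only an illustration, not anything Google provides:
the function names are made up for this example, and it does no more
than build a query string and check text for hiragana code points. The
reasoning is that の is very common in Japanese running text and does
not occur in ordinary Chinese text, so requiring it tends to exclude
Chinese pages that merely share kanji with the search term.

```python
# Hypothetical helpers; names are illustrative, not part of any search API.

HIRAGANA_NO = "\u306e"  # the hiragana character "no" (の)


def contains_hiragana(text: str) -> bool:
    """Return True if text has at least one hiragana letter (U+3041..U+3096)."""
    return any("\u3041" <= ch <= "\u3096" for ch in text)


def add_hiragana_filter(query: str) -> str:
    """Append の as an extra search term unless the query already contains it."""
    if HIRAGANA_NO in query:
        return query
    return f"{query} {HIRAGANA_NO}"


if __name__ == "__main__":
    # A kanji-only term could match both Japanese and Chinese pages.
    print(add_hiragana_filter("辞書"))
    # Rough client-side check of retrieved text, assuming it is already decoded.
    print(contains_hiragana("日本語の辞書"), contains_hiragana("中文词典"))
```

The same hiragana check can be run over retrieved snippets as a crude
sanity test of whatever language filtering the search engine applied.
It will miss Japanese text written entirely in kanji or katakana, so it
is a heuristic and not a language classifier.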



Jim Breen http://www.csse.monash.edu.au/~jwb/
Computer Science & Software Engineering, Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia Fax: +61 3 9905 5146
(Monash Provider No. 00008C) ジム・ブリーン@モナシュ大学
