[Corpora-List] problems with Google counts

Matthew Hurst mhurst at intelliseek.com
Wed Mar 16 15:52:00 CET 2005

I did a simple search for my family blog with the name of the blog and nary a
Japanese page appeared on the first list of results. It could indicate that the
blog isn't indexed and that the behaviour of Google is to back off to pages that
at least include the term. If so, I would say that that was the wrong behaviour.


Jim Breen wrote:

> Matthew Hurst <mhurst_AT_intelliseek.com> wrote:
>>>As for Lillian's original post, I notice that Google's language classifier,
>>>at least for Japanese, is not very good...
>
> What sorts of problems are you encountering? I used to include
> the hiragana "no" in all Google requests to prevent Chinese pages
> being picked up, but these days the language setting seems to get the
> same outcome.
>
> Cheers
>
> Jim

