[Corpora-List] Re: problems with Google

Marian Olteanu mou_softwin at yahoo.com
Fri Mar 18 09:13:55 CET 2005


Well, the results from the Google API were ALWAYS a little (or more than a little) different from
those reported by http://www.google.com/ . You will see a different order for the results, and a
small (or big) difference in the hit counts, which is what we are interested in.

--- "Deane, Paul" <pdeane at ets.org> wrote:


> Has anybody checked whether the behavior with Google's Web API and its

> standard search is different?

>

> I have code using the Java Web API which makes use of the asterisk to blank

> out a single word (not an unrestricted wildcard.) As of yesterday, when I

> tested the code, it still appeared to be working as designed.

>

> -----Original Message-----

> From: Andrew Kehoe [mailto:Andrew.Kehoe at uce.ac.uk]

> Sent: Thursday, March 17, 2005 9:27 AM

> To: CORPORA at uib.no

> Subject: RE: [Corpora-List] Re: problems with Google

>

>

>

> John

>

> Even if you put double quotes around the wildcard character Google will

> ignore it. When you search for:

>

> "what does "*" mean"

>

> Google is actually searching for 2 'phrases': "what does " and " mean". You

> cannot nest double quotes in Google so the double quotes around the * are

> actually closing your initial quote and beginning a new quote, with the

> wildcard ignored completely.

>

> It may be the case that SOME of the pages Google returns will contain "what

> does", followed by one other word, followed by "mean" but your query does

> not ask for this specifically. Google could (and does) also return pages

> containing "mean" and "what does" in the opposite order, or with multiple

> words in between.
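
To make the gap concrete: the pattern John was hoping for ("what does", exactly one word, then "mean", in that order) could be checked on the returned page text after the fact. The following is only a minimal sketch in Python; the regular expression and the function name one_word_fillers are illustrative assumptions, not anything Google or WebCorp provides.

```python
import re

# Hypothetical pattern: "what does", exactly one whitespace-free token,
# then "mean", all adjacent and in this order.
ONE_WORD_GAP = re.compile(r"\bwhat does (\S+) mean\b", re.IGNORECASE)


def one_word_fillers(text):
    """Return every single token found between 'what does' and 'mean'."""
    return ONE_WORD_GAP.findall(text)
```

A page that has "mean" before "what does", or several words in between, yields an empty list here, even though the search described above could still return it.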

>

> Similarly, "what does "*" "*" mean" is actually searching for 3 'phrases':

> 1) "what does ", 2) " " (a space), and 3)" mean".
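
The quote pairing described above can be imitated with a toy model. This is a sketch in Python that assumes quotes are simply paired left to right and that the quotes in the query are balanced; it is not Google's actual parser, and the function name split_quoted_phrases is made up for illustration.

```python
def split_quoted_phrases(query):
    """Pair double quotes left to right.

    Returns (phrases, outside): the text inside each quote pair, and the
    non-empty text left outside any pair. Assumes balanced quotes.
    """
    parts = query.split('"')
    # Odd-indexed parts sit between an opening and a closing quote.
    phrases = parts[1::2]
    # Even-indexed parts sit outside every quote pair.
    outside = [p for p in parts[0::2] if p]
    return phrases, outside
```

Under this model, the first query gives the two phrases "what does " and " mean" with the asterisk left outside the quotes, and the second gives three phrases, the middle one being a single space.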

>

> So, Google hasn't retained support for wildcards at all I'm afraid, and this

> is why we are developing our own search engine in WebCorp, as Antoinette

> Renouf mentioned yesterday.

>

> Andrew Kehoe

> Research and Development Unit for English Studies

> University of Central England in Birmingham

>

> http://www.webcorp.org.uk/

>

> -----Original Message-----

> From: owner-corpora at lists.uib.no on behalf of John Milton

> Sent: Thu 17/03/2005 13:39

> To: CORPORA at uib.no

> Cc:

> Subject: [Corpora-List] Re: problems with Google

>

>

>

> I just discovered that Google seems to have retained some use of the

> wildcard for words if you use double quotes with the asterisk. A search

> for "what does "*" mean" and "what does "*" "*" mean" results MAINLY in

> any one and two words respectively. If anyone else is using web searches

> as language learning/teaching resources, this also looks promising:

> http://www.findforward.com/

>

> John Milton

> Hong Kong University of Science & Technology



Marian
http://www.utdallas.edu/~mgo031000/


