[Corpora-List] Cross-document coreference/Entity Resolution: $50,000 Spock Challenge

Eric Atwell eric at comp.leeds.ac.uk
Thu Apr 19 10:02:00 CEST 2007

Thanks for telling us what is in the download file, so that we don't
have to download it ourselves: 97,000 files (9 GB) of raw HTML, which
contestants first have to "clean" themselves before they can try any
fancy NLP stuff.

A group of European researchers from Trento and Leeds have launched
CLEANEVAL, another contest to build tidy tools for web-as-corpus
research; see http://cleaneval.sigwac.org.uk/. This could be a useful
first step for anyone trying the Spock challenge; any Spock
contestants could also enter their tidy-tool in the CLEANEVAL contest!

Eric Atwell, Leeds University

-----Original Message-----

> From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On Behalf Of Alexandre Rafalovitch
> Sent: Wednesday, April 18, 2007 10:07 PM
> To: CORPORA at uib.no
> Subject: Re: [Corpora-List] Cross-document coreference/Entity Resolution: $50,000 Spock Challenge
>
> The website is rather sparse on information at the moment, so I have
> downloaded their (rather large) corpora and had a look.
>
> If anyone is interested in the challenge, my overview might help you
> to make a decision better and faster:
> http://blog.outerthoughts.com/2007/04/spock-announces-an-entity-resolution-competition/
>
> Hope it helps,
> Alex.

