Fw: [Corpora-List] Resend: CLEANEVAL Web-as-Corpus exercise
senta.setinc at triera.net
Wed Apr 4 00:31:00 CEST 2007
----- Original Message -----
From: "Senta Setinc" <senta.setinc at triera.net>
To: "Adam Kilgarriff" <adam at lexmasterclass.com>
Sent: Tuesday, April 03, 2007 11:49 PM
Subject: Re: [Corpora-List] Resend: CLEANEVAL Web-as-Corpus exercise
> Forgive me for sending you this information, which is much, much less
> important than Adam's: for those who have experienced problems with
> overloaded hard disks, there is a wonderful new (not new to all of you,
> I am sure) cleaning software - a free tool named CCleaner (Crap Cleaner).
> You can download it from here: http://www.ccleaner.com
> In only a matter of seconds I gained about 1.3 gigabytes of free space.
> Amazing, really.
> All the best to all, Senta
> ----- Original Message -----
> From: "Adam Kilgarriff" <adam at lexmasterclass.com>
> To: <sigwac at sslmit.unibo.it>; <corpora at hd.uib.no>
> Sent: Tuesday, April 03, 2007 6:56 PM
> Subject: [Corpora-List] Resend: CLEANEVAL Web-as-Corpus exercise
> **Apologies for faulty links in last version**
> CLEANEVAL is a shared task and competitive evaluation for cleaning
> web pages, with the goal of preparing web data for use as a corpus, for
> linguistic and language technology research and development. You are
> invited to participate, and to encourage others to do so too.
> Website: http://cleaneval.sigwac.org.uk
> Development dataset now available.
> * Prizes! A prize of £250.00 (GBP) will be awarded for the best
> student entrant for each task (Chinese and English).
> * Timetable:
>   * March 2007: Development datasets released (English and Chinese)
>   * June 2007: Exercise: Evaluation dataset released and, two weeks
>     later, participants to return cleaned pages
>   * end June 2007: Papers describing systems to be submitted
>   * Sept 15-16 2007: Workshop, part of WAC3, Louvain-la-Neuve, Belgium
> * Co-ordinators:
>   * Marco Baroni, Trento University, Italy
>   * Tony Hartley, Leeds University, UK
>   * Adam Kilgarriff, Lexical Computing Ltd., Leeds and Sussex Univs, UK
>   * Serge Sharoff, Leeds University, UK
> CLEANEVAL is an activity of ACL-SIGWAC, the Association for Computational
> Linguistics (ACL) Special Interest Group on Web as Corpus.