[Corpora-List] BAWE corpus now archived and available

Lou's Laptop lou.burnard at oucs.ox.ac.uk
Sat Oct 4 09:33:50 CEST 2008

I note the weasel words "non-commercial use" in the agreement Steven quotes. Can't speak for my colleagues in the OTA or in Essex, but my guess is that it's that which is making those archives (or more likely their former funders) anxious: it means they can bounce requests from Microsoft's research department (thus requiring same to apply for copies from their personal e-mail addresses). The world would be a simpler and probably better place if distributors of such resources just accepted that evil commercial people out there making some money out of them might not be such a bad thing.

The suggestion about making a snippet available in advance is a good one; some revisions have been made to the way OTA texts are displayed on the web, and this might be one we could incorporate.

Just my personal opinions!

Steven Bird wrote:
> On Sat, Oct 4, 2008 at 1:29 AM, jasper holmes <jasper.holmes at gmail.com> wrote:
>> We are pleased to announce that the British Academic Written English
>> (BAWE) corpus is now available to all researchers ...
>> There are no restrictions on access to the corpus ...
> Except that the UK Data Archive requires users to fill in a web form,
> which leads to:
> "Fax or post a signed copy of this form to: UK Data Archive,
> University of Essex, Wivenhoe Park, Colchester, Essex, CO4 3SQ Fax:
> +44 (0) 1206 872003 Upon receipt of the signed form, we will create
> an Athens account for you within three working days. You will then
> receive an email and will be able to register with ESDS."
> The Oxford Text Archive requires users to fill in a web form, which leads to:
> "Thank you for requesting British Academic Written English Corpus.
> Staff at the Oxford Text Archive need to approve your request before
> granting you access to this resource."
> These steps seem like overkill for a corpus which has generous
> permissions: "Available for non-commercial use on condition that this
> header is included in its entirety with any copy distributed."
> It would be helpful if UKDA and OTA didn't impose these extra barriers
> to access for such corpora. I wonder what criteria they use in
> approving an application. It would also be helpful if they made a
> sample of the data available so users could see if a corpus met their
> needs before going through the application process.
> -Steven Bird
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
