[Corpora-List] Query: Corpora of American and British English that can be compared?

Laure Gardelle laure.gardelle at ens-lyon.fr
Thu Dec 20 16:31:39 CET 2012


Many thanks for your replies, which are very helpful!

All the best

Laure

Eric Atwell <E.S.Atwell at leeds.ac.uk> wrote:


> I agree with Adam and True Friend: LOB and Brown are the long-standing,
> established corpora for comparing UK and US English, from the 1960s.
> BUT you asked for "... sufficiently close collection procedures
> for the hits they return to be compared ...", which suggests you really
> want web-as-corpus collections gathered more recently by
> web crawlers? If so: the World Wide English Corpus
> http://www.comp.leeds.ac.uk/eric/wwe.shtml
> includes 2M-word samples of UK English and US English, collected
> using SketchEngine's WebBootCat web-as-corpus harvester,
> for student exercises in comparing world varieties of English.
>
> Eric Atwell, Leeds University
>
> On Thu, 20 Dec 2012, Adam Kilgarriff wrote:
>
>> Dear Laure,
>> the straightforward answer is the 'Brown family' corpora. Brown and LOB
>> were compiled with just this kind of analysis in mind: both date from 1961,
>> and further comparable data points are available for 1991 (FROWN and FLOB) and
>> (though maybe for British English only) 1931, 1901 and 2006.
>>
>> You can do the comparisons easily and directly in the Sketch Engine, where
>> the data is already set up (including POS tagging) and the 'Brown family'
>> corpus contains all the above except the 1901 part.
>>
>> Regards
>>
>> Adam
>>
>> On 18 December 2012 09:23, Laure Gardelle <laure.gardelle at ens-lyon.fr>
>> wrote:
>> Dear colleagues,
>>
>> For my research I need to compare one set of agreement patterns
>> in American and British English.
>> So would anyone know of two corpora (one for American English,
>> the other for British English) that would have sufficiently
>> close collection procedures for the hits they return to be
>> compared (i.e. for possible differences in proportion to be
>> considered meaningful)? Ideally I am looking for contemporary
>> English, but if the data are a bit older, it is not a problem.
>>
>> Many thanks in advance for any help with this!
>>
>> Laure Gardelle
>>
>> --
>> ========================================
>> Adam Kilgarriff, adam at lexmasterclass.com
>> Director, Lexical Computing Ltd
>> Visiting Research Fellow, University of Leeds
>> Corpora for all with the Sketch Engine
>> DANTE: a lexical database for English
>> ========================================
>>
>>
>
> --
> Eric Atwell, Associate Professor, Language research group,
> I-AIBS Institute for Artificial Intelligence and Biological Systems
> School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
> Leeds LS2 9JT, England. TEL: 0113-3435430 FAX: 0113-3435468
> WWW: http://www.comp.leeds.ac.uk/eric
> http://www.comp.leeds.ac.uk/nlp
> http://www.comp.leeds.ac.uk/arabic
>
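
For readers wondering what testing whether "differences in proportion" are "meaningful" might look like once hits have been pulled from two comparably built corpora, here is a minimal sketch. It is not from the thread: it assumes the hits for each variety have been sorted into two agreement variants (labelled singular and plural purely for illustration) and applies an ordinary Pearson chi-square test to the resulting 2x2 table. The function name and all counts are invented.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
           variant 1   variant 2
    US        a           b
    UK        c           d
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical counts, invented for illustration only.
us_singular, us_plural = 180, 20
uk_singular, uk_plural = 120, 80

stat = chi_square_2x2(us_singular, us_plural, uk_singular, uk_plural)

# With 1 degree of freedom, 3.84 is the critical value at p < 0.05.
print("chi-square:", stat)
print("differs at p < 0.05:", stat > 3.84)
```

A test like this only says that the two proportions differ; whether the difference reflects the varieties rather than the sampling still depends on the corpora having been collected in comparable ways, which is the point of the original question.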


