[Corpora-List] old/modern english corpus data??

George Walkden george.walkden at manchester.ac.uk
Fri Nov 9 10:21:09 CET 2012


Dear Jungsoo,

There's also the Parsed Corpus of Early English Correspondence (PCEEC), freely available via the Oxford Text Archive. The manual is at http://www-users.york.ac.uk/~lang22/PCEEC-manual/index.htm

It has 2.2 million words from 1410-1695. A bit earlier than the ones Kat mentions, but it has the advantage of being POS-tagged (though not lemmatized).

Best,

- George


:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
George Walkden
Lecturer in English Linguistics
University of Manchester
george.walkden at manchester.ac.uk
http://personalpages.manchester.ac.uk/staff/george.walkden/
Office: N1.2 Samuel Alexander Building
Tel.: +44 (0)161 275 8905
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

On 9 Nov 2012, at 01:00, "K Gupta" <k.e.gupta at gmail.com> wrote:

Dear Jungsoo,

You may find the following helpful:

Corpus of Late Modern English Texts - https://perswww.kuleuven.be/~u0044428/ It comprises two sections: the Corpus of Late Modern English Texts (CLMET) and the Corpus of Late Modern English Texts Extended Version (CLMETEV). Both consist of texts arranged in the following time periods: 1710-1780, 1780-1850, and 1850-1920. The texts vary in genre, ranging from personal letters to literary fiction to scientific writing, but the collection inevitably has more formal prose.

Zurich English Newspaper Corpus - http://www.helsinki.fi/varieng/CoRD/corpora/ZEN/index.html It contains 349 complete newspaper issues published between 1661 and 1791, amounting to 1.6 million words.

The Lampeter Corpus of Early Modern English Tracts - http://ota.ox.ac.uk/headers/2400.xml Tracts and pamphlets published between 1640 and 1740, organised into the categories of religion, politics, economy and trade, science, law and miscellaneous. There are 120 different texts, amounting to 1.1 million words

Best wishes, Kat

On 9 November 2012 00:23, Jungsoo Kim <jungsookim0845 at gmail.com> wrote:

Does anyone know where to find freely available online Old/Modern English corpora whose data are from before 1800 (the Google Books corpora are not ideal for me)? It would be more than wonderful if they had a search function that enables us to search the data by word, lemma, and part of speech.

I would be really grateful for any sort of help,
Jungsoo

_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora
