[Corpora-List] old/modern english corpus data??

K Gupta k.e.gupta at gmail.com
Fri Nov 9 01:47:01 CET 2012


Dear Jungsoo,

You may find the following helpful:

*Corpus of Late Modern English Texts* - https://perswww.kuleuven.be/~u0044428/ It comprises two sections: the Corpus of Late Modern English Texts (CLMET) and the Corpus of Late Modern English Texts Extended Version (CLMETEV). Both consist of texts arranged in three periods: 1710-1780, 1780-1850, and 1850-1920. The texts vary in genre, ranging from personal letters to literary fiction to scientific writing, but the corpus inevitably contains more formal prose.

*Zurich English Newspaper Corpus* - http://www.helsinki.fi/varieng/CoRD/corpora/ZEN/index.html It contains 349 complete newspaper issues published between 1661 and 1791, amounting to 1.6 million words.

*The Lampeter Corpus of Early Modern English Tracts* - http://ota.ox.ac.uk/headers/2400.xml Tracts and pamphlets published between 1640 and 1740, organised into the categories of religion, politics, economy and trade, science, law, and miscellaneous. There are 120 texts, amounting to 1.1 million words.

Best wishes, Kat

On 9 November 2012 00:23, Jungsoo Kim <jungsookim0845 at gmail.com> wrote:


> Does anyone know where to find freely available online Old/Modern
> English corpora whose data are from before 1800 (Googlebooks corpora are not
> ideal for me)? It would be more than wonderful if they had a search
> function that enables us to search the data by word, lemma, and part of
> speech.
>
> I would be really grateful for any sorts of help,
> Jungsoo