[Corpora-List] NEW NEWSGROUP: Corpus Linguistics with R

Stefan Th. Gries stgries_lists at arcor.de
Thu Mar 22 23:04:01 CET 2007

Dear list members

This is to alert you to a new newsgroup that may be of relevance to you. This group is concerned with corpus-linguistic applications of what I consider the corpus linguist's Swiss Army Knife, R. R (<http://www.r-project.org>) has the following characteristics that make it the ideal tool for any corpus linguist:

- it is a full-fledged programming language, i.e., it has hardly any restrictions on what corpus linguists can do with it and can therefore be used to generate all essential corpus-linguistic output formats (frequency lists, concordances, and collocation displays) as well as many other things no ready-made tool can provide;
- as a programming language, it leaves the user in charge of retrieval settings instead of relying on a program's sometimes difficult-to-identify settings; at the same time, R is much easier to handle than languages such as Perl or Python, which many find too daunting to learn;
- R's capabilities for statistical and graphical analyses of corpus data far exceed those of manual or spreadsheet-based evaluation;
- it is open source software for Microsoft Windows, Mac OS, and Linux/Unix;
- there is a lively research community out there, constantly developing new stuff and providing the ideal basis for scientific exchange.

R has also become increasingly popular in the general linguistics community, as is evidenced by a variety of textbooks that are about to be published both in corpus linguistics and in statistics for linguistics. In the hope of being able to contribute to this lively research community, here are some details about the new mailing group:

- its name: CorpLing with R
- its URL: <http://groups.google.com/group/corpling-with-r>
- its first purpose: this group is concerned with using R for corpus linguistics; thus, postings on loading, searching, and processing all kinds of corpora (with/without regular expressions) and on performing statistical/graphical analyses of corpus data are more than welcome
- its second purpose: to host the companion website of my intro to quantitative corpus linguistics with R, which will be published by Routledge at the end of this year / at the beginning of next year

Feel free to have a look at the list or, even better, sign up to post questions, comments, suggestions, and the like.

Stefan Th. Gries
University of California, Santa Barbara
