[Corpora-List] Advance notice: ‘Corpus Linguistics with R’ and ‘Statistics for linguistics with R’ bootcamps by S.T. Gries (Louvain-la-Neuve, Belgium, August 2019)

Magali Paquot magali.paquot at uclouvain.be
Tue Nov 13 09:29:26 CET 2018

The Linguistics Research Unit of the Institute of Language and Communication (Université catholique de Louvain, Belgium) will be hosting two 30-hour bootcamps by Stefan Gries next summer.

The ‘Corpus Linguistics with R’ bootcamp (12-16 Aug 2019) is a hands-on introduction to using the programming language R for the analysis of textual data (mostly corpora, but theoretically also literary works, web data, etc.). It is based on the second edition (2016) of Gries’s textbook /Quantitative corpus linguistics with R/ <https://www.amazon.com/dp/1138816280> and introduces a variety of programming constructs required for text processing and corpus exploration including

* building word frequency lists and computing type-token ratios;

* computing dispersion and keyword statistics;

* extracting concordance lines.

For that, we will discuss the relevant functions and data structures, control flow structures such as loops and conditionals, and a sizable number of regular expressions; time permitting, we will also cover the basics of data visualization. The data used in this course come from a variety of differently formatted/annotated corpora and will also include 1-2 examples of literary works and/or XML processing.

The ‘Statistics for linguistics with R’ bootcamp (19-23 Aug 2019) is a hands-on introduction to statistical methods for both graduate students and seasoned researchers and is based on the second edition (2013) of Gries’s textbook /Statistics for linguistics with R/ <https://www.amazon.com/dp/3110307286>. The course is intended for linguists who already have a basic knowledge of statistics and some experience using R, and who wish to improve their proficiency in the statistical analysis of linguistic data. Using the open source software and programming language R, we will:

* briefly recap basic aspects of statistical evaluation as well as several descriptive statistics;

* briefly discuss a selection of monofactorial statistical tests for frequencies, means, correlations and how they constitute special (limiting) cases of regression methods;

* explore different kinds of multifactorial and multivariate methods, in particular different kinds of regression approaches (fixed-effects only and mixed-effect modelling) as well as classification trees and random forests.

Details about the previous edition of the ‘Statistics for linguistics with R’ bootcamp in Louvain-la-Neuve are available at: https://uclouvain.be/en/research-institutes/ilc/cecl/rling2017.html. For information about the prerequisites, visit https://uclouvain.be/en/research-institutes/ilc/cecl/prerequisites.html.

The website for the two events will be online in early 2019 and online registration will start on *1 March 2019*. It will be possible to register for one event only, but priority will be given to people who register for both. The number of participants is limited. If you would like to participate, mark the date in your diary!

Contact email: magali.paquot at uclouvain.be

--
*Dr. Magali Paquot*
/FNRS Research Associate/
Centre for English Corpus Linguistics
Institut Langage et communication
Université catholique de Louvain
http://perso.uclouvain.be/magali.paquot/