[Corpora-List] Bootcamp: 'Quantitative Corpus Linguistics with R'-- re Louw's endorsement

Wolfgang Teubert w.teubert at bham.ac.uk
Thu Aug 14 12:32:57 CEST 2008


Dear All,

I find the interaction between Bill Louw and Stefan Gries on this list so exciting that I cannot resist the temptation to contribute to it. Of course Bill Louw gets it wrong by expecting a bootcamp to be anything like a conference. The corpus tells us that a boot camp is

another word for a military training camp which was used during World War II and many other wars

a very strict, highly structured facility with staff that act as drill instructors

more like a review that prepares you for an exam.

Dagmar S. Divjak's and Stefan Gries' boot camp is, as I see it, not about discussing corpus linguistics; rather, it tells participants

how to generate frequency lists;

how to search for words and patterns;

how to handle corpora and perform corpus-linguistic searches that typical corpus software does not support;

how to carry out basic statistical evaluations of corpus data (significance tests and statistical graphs).

Gries claims that statistics clearly plays a subordinate role in this syllabus, but also that R-based software tools will be made available that make it easy to perform many of the above operations. The title of the event is: "Quantitative Corpus Linguistics with R." The provider of this software tells us: "R is a free software environment for statistical computing and graphics." (http://www.r-project.org/)

For the R software, it does not matter what kind of strings of information bits are processed. It could be language, but it could also be DNA sequences or the digits after the "3." in the number pi. To me it seems that much of what will be presented at the camp is relatively application-free. Language is just one of many possible applications. What is not discussed is what a morpheme is, what makes a sentence a sentence, or how we can measure language acquisition. What is not mentioned is meaning.
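The point about application-independence can be illustrated with a minimal sketch. It is written in Python rather than R, and the function name frequency_list and the three sample inputs are made up for illustration; nothing here is taken from the boot camp materials.

```python
from collections import Counter


def frequency_list(symbols):
    """Return (item, count) pairs, most frequent first.

    Hypothetical helper: it only counts hashable items and has no
    notion of what the items stand for.
    """
    return Counter(symbols).most_common()


# The same function is applied to three unrelated kinds of "strings".
words = "the cat sat on the mat".split()  # word tokens
dna = "GATTACA"                           # nucleotide letters
pi_digits = "14159265358979"              # digits after the "3." in pi

for label, data in (("words", words), ("dna", dna), ("pi", pi_digits)):
    print(label, frequency_list(data)[:3])
```

Nothing in the counting step refers to morphemes, sentences, or meaning; any such interpretation has to be brought to the output from outside.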

But then we have to remember that Stefan Gries wears at least two hats. The journal he co-edits bears the name Corpus Linguistics and Linguistic Theory. The only language theory that Gries accepts is cognitive linguistics. His homepage leaves us in no doubt. Meaning, for Gries, is a theoretical and therefore a cognitive concept. It plays no role in his version of corpus linguistics.

Old-fashioned corpus linguists like myself have to accept that the label corpus linguistics has, over the last decade, been hijacked by theoretical linguists of all stripes. What used to be, and for some of us still is, a radically different, new way to look at language has been reduced to a bunch of methods, a toolbox to "search for words and patterns." Its role is to provide empirical data that will then be interpreted from the theoretical platform of cognitive linguistics. Corpus linguists are not innocent of this trend. At home in applied linguistics, they have often shied away from formulating the fundamental difference between the two approaches: for cognitive linguists, meaning is in the individual, monadic minds of speakers and hearers; for corpus linguists, meaning is in the discourse (or the corpus, as a sample thereof).

For Bill Louw, the inspirational theoretician of my version of corpus linguistics, collocation, and certainly not statistics, is at the very heart of meaning. It is how meaning configures itself within a text and within the discourse. It relates a phrase we find in a text to the discourse at large. It allows us to investigate meaning through intertextual links and through paraphrase. It does not supply us with a hypothetical model of the meaning of a phrase, as cognitive linguistics does. Rather it presents the evidence of the meaning itself. It is then up to the interpretive community to make sense of it. Language is symbolic. Meaning has to be negotiated. It is irreducible to neurons firing in our brains.

Cognitive linguistics tells Stefan Gries what a morpheme, a word, a phrase or a pattern is. This, then, is his input into the toolbox that he and many others now call corpus linguistics. Corpus linguists still don't know what a morpheme, a word, a phrase or a pattern is. That is why they always insist on discussing collocation. But they know that words change their meaning. There would be no innovation without the re-interpretation of what is there. Stefan Gries' brand of corpus linguistics may well be our brave new world. It is, however, not John Sinclair's corpus linguistics.

Wolfgang Teubert


