[Corpora-List] Uses of N-grams?

Krishnamurthy, Ramesh r.krishnamurthy at aston.ac.uk
Thu Jul 18 18:03:11 CEST 2013

apologies for hitting 'send' before inserting the 'subject' heading! :(

________________________________
From: Krishnamurthy, Ramesh
Sent: 18 July 2013 17:01
To: cedric.krummes at uni-leipzig.de
Cc: corpora at uib.no
Subject:

Hi Cedric

As we cannot be sure of the meaning or the part-of-speech of an item
from a word frequency list, are not n-grams a sort of halfway house
between word frequency lists and concordances?
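
To make the "halfway house" idea concrete, here is a minimal Python sketch (not part of the original message; the toy sentence and the helper name `ngrams` are invented purely for illustration). A word frequency list only reports that "bank" occurs three times, whereas the bigram list shows some of its immediate context, though still less than a full concordance line would.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples (hypothetical helper)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Invented toy text, whitespace-tokenised for simplicity.
text = "she sat on the river bank . he went to the bank to open a bank account ."
tokens = text.split()

# Word frequency list: tells us how often "bank" occurs, but not in which sense.
word_freq = Counter(tokens)

# Bigram frequency list: each item carries one word of context.
bigram_freq = Counter(ngrams(tokens, 2))

print("bank:", word_freq["bank"])
for bigram, count in bigram_freq.items():
    if "bank" in bigram:
        print(" ".join(bigram), count)
```

Bigrams such as "river bank" and "bank account" hint at the sense of the item in a way the bare word count cannot, which is roughly the intermediate position described above.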

To me, n-grams are just one of the tools in the corpus linguistics toolbag,
although they may be a relative newcomer, and haven't grabbed the headlines
like keywords, perhaps.

If I remember correctly, at Cobuild, we first used bigrams for the BBC
dictionary (published in 1992). I don't think n-grams were a feature of
the earlier versions of WordSmith, and even in the more recent
AntConc, the n-grams option is slightly hidden.

Since the 1990s, I have used n-grams as a routine part of corpus
analysis, if they are available in the software I am using at the time,
for a variety of purposes (e.g. investigating language varieties in 'The
Globalization of Business English?' at Complex 2001; investigating
genre features in 'A corpus-based analysis of junk emails' at LREC
2002; and recently, to compare Business Spanish and Business French
in research for the COMENEGO project).

Access to Google n-grams seems to have sparked interest in studies
into historical changes in social, cultural, and political values?




Date: Thu, 18 Jul 2013 09:51:30 +0200
From: Cedric Krummes <cedric.krummes at uni-leipzig.de>
Subject: [Corpora-List] Uses of N-grams?
To: Corpora at uib.no


Regarding n-grams (highly frequent word sequences like "on the other hand" or "why don't you"), does anybody have any uses for them apart from language teaching?

Most literature dealing with n-grams seems to apply them to foreign language teaching, second language acquisition, or English for X purposes. Any other uses?

Best wishes,

Cédric Krummes
--
Dr. Cédric Krummes
Universität Leipzig
+49-341-97-37404
http://www.cedrickrummes.org/contact.php
