I recently requested information on any *published materials* or *on-line materials* adopting a data-driven learning approach. My thanks to the following for their replies:
* Adam Turner
* Chris Tribble
* Mike Barlow
* Brett Reynolds
* Stéphanie O'Riordan
* Antoinette Renouf
* James Thomas
* Linda Bawcom
* Marcia Veirano Pinto
* Przemek Kaszubski
* Simon Smith
* John Milton
Unfortunately (if unsurprisingly), there were no real additions to the publications I listed in the original mail. Is there really so little out there? Why? One respondent commented that his name had been suggested to two different publishers "to write a corpus book for teachers/students. Both of them said they liked the idea and A said the world is not ready for it and B said that they were already doing something on corpora at the time."
Below are the main responses; I have a number of other links to resources which I'll include on a site I hope to set up this summer (linked from my homepage at http://arche.univ-nancy2.fr/course/view.php?id=967).
Some of the resources mentioned are already well-known and much-appreciated, including the following:
* *Compleat Lexical Tutor* by Tom Cobb and Chris Greaves with its
many resources http://www.lextutor.ca/ (free)
* *MICASE* for academic spoken American English
* Mark Davies' 360m-word *BYU Corpus of American English* and
interface to the *British National Corpus* among others
* Mike Scott's comprehensive *WordSmith Tools* for corpus analysis
http://www.lexically.net/wordsmith/ (a demo version can be
* *SketchEngine* by Adam Kilgarriff <http://www.kilgarriff.co.uk/>,
Pavel Rychlý <http://www.fi.muni.cz/%7Epary/> & Jan Pomikálek, a
corpus interface and word profiler http://www.sketchengine.co.uk/
(30-day free trial)
The following projects are mentioned by people closely connected with them; comments are from their mails and/or from the sites themselves:
*Adam Turner: /Hanyang University Online Writing Lab (OWL)/* http://www.hanyangowl.org/ I have used the advanced search functions of Adobe PDF files with Korean graduate students writing for publication in English, especially in the sciences. It is much more user-friendly than concordance software and can be used almost immediately in the classroom. When combined with a small specialized corpus it is very effective. I have received good feedback from faculty who have taken workshops from me using this approach. As an English for Specific Purposes instructor rather than an engineer, I also used this approach to build up samples of text when creating materials for engineering research writing; these informed an in-house guide to engineering writing that I wrote. The full text of the book and the workshop handouts on how to use Google Scholar and Adobe Acrobat advanced search functions with students and faculty can be found here. Look under ESSENTIAL HANDOUTS on the right sidebar.
*Mike Barlow: /CorpusLAB/* http://www.corpuslab.com/ CorpusLAB is a new FREE site for language learners and language teachers. CorpusLAB is designed to make use of the results of corpus analysis to promote language learning based on real English used in different settings. Students can use the site to take a variety of exercises created by teachers. Go to the Student pages and select a topic area (/Academic English/, /Business English/, etc.). If students register, they will be able to keep track of their progress. Teachers can use the site in different ways. The central engine of the site is a series of exercise authoring tools. The exercises (fill-the-gap, multiple-choice, matching, reorder, and categorise) follow the traditional pattern, but they are designed in a way that promotes the learning of collocations and phrasal patterns. For example, the matching exercise allows up to five columns of items rather than the usual two, thereby providing practice in a range of collocations and phrases. Another feature of the site is the sharing of corpus resources and corpus-informed materials such as wordlists, handouts, ppts, etc. One of the aims of the site is to build up resources for specialised English: Medical English, English for Tourism, and so on. In addition, teachers have access to a corpus of spoken professional English via a simple concordancer. A utility for the analysis of potential teaching texts is also under development.
*Brett Reynolds: /Simple English Wiktionary/* http://simple.wiktionary.org/ The Simple English Wiktionary incorporates corpus data in selecting example sentences and presents lemmas in frequency order as much as possible. This is a project that I'm heavily involved in and can speak more about. Because of its open aspect the amount of data used in writing it varies considerably between editors. [...] Many examples are taken from the BNC, though they are sometimes edited. For instance, the noun 'intensity' turns up in this text: "...but there are many others that suffer from high intensity of sunlight." That's been edited to "These flowers suffer from high intensity of sunlight."
*Antoinette Renouf: /WebCorp Linguist's Search Engine/* http://wse1.webcorp.org.uk/preview/ ...a resource which in its previous guise as WebCorp (http://www.webcorp.org.uk/wcadvanced.html) allowed thousands of learners and others to access the Web as a 'corpus', or at any rate a ready source of up-to-date language data, the output tailored by WebCorp tools for easy use. This activity started in 2001 and continues, but the new version of WebCorp, /WebCorp Linguist's Search Engine/, has its own search engine, allowing us both to bypass Google and other non-linguistically-oriented search engines, and to create pre-processed subcorpora to suit individual users. The demo for the latest tool is at http://wse1.webcorp.org.uk/preview/ , but people will need to ask us for a password, as the site is still under development and we are still working with identified users. The publications associated with both WebCorp systems can be found at: http://rdues.bcu.ac.uk/bibliog.shtml. The early papers (before 2000) refer to WebCorp, but after that, they refer partly or wholly also to WebCorpLSE.
*James Thomas: /A Ten-step Introduction to Concordancing through the Collins Cobuild Corpus Concordance Sampler/* http://www.fi.muni.cz/%7Ethomas/CCS/ This website is a quick rewrite of one of the same name that was created in 2002, and hosted on a public server. Since then, the Cobuild Sampler went completely offline for a long time. It is now back with some improvements that are not yet reflected in this Ten Step Intro. Concordancing for language study itself has undergone some evolution which will be reflected in the next version. It is my intention to create a similar Introduction to Bonito, the concordancer created at the Faculty where I work. For access to Bonito and other web-based concordancers, see http://www.fi.muni.cz/~thomas/EAP/concordancers.htm
*Przemek Kaszubski: /IFA Concordancer/* http://ifa.amu.edu.pl/~ifaconc I have been building a site mainly for our local academic EFL purposes but with a functional public demo mode. There is little content there as yet, but I very much hope the site will grow. The tool is called IFAConc and can be found here: http://ifa.amu.edu.pl/~ifaconc. It is partly inspired by Tom Cobb's ideas as well as Tim Johns' kibbitzer pages, and some more.
*Simon Smith:* (re */SketchEngine/* http://www.sketchengine.co.uk/) I'm involved in two projects in which users are presented with corpus data: one on Chinese, the other on English. Both of them make use of Adam Kilgarriff's Sketch Engine corpus query tool. *In the Chinese project,* Alice Chen and I tried to assess, using pre- and post-tests, the progress in acquisition of collocational patterns made by a group of intermediate to advanced Chinese learners. These learners were exposed for a period of time to a large corpus of Chinese, accessed through the concordances and usage summaries offered by Sketch Engine. We prepared a walkthrough guide <http://mcu.edu.tw/%7Essmith/walkthrough/> to the use of corpora for language learning in general (and the Sketch Engine in particular), and described the work in a paper <http://www.kilgarriff.co.uk/Publications/2007-SmithChenKilg-PALC.doc> given at PALC, in Lodz, last year. The results of that work were rather inconclusive, partly because our learners were left to their own devices as to how they went about exploring the corpus, and what they learned from it. In July, I'll be building on that work with a much more task-focused Chinese-learning experience. This will be aimed at beginners, and will take the form of a workshop at TALC 2008, Lisbon <http://talc8.isla.pt/workshops.html#mandarin>. Participants will learn about an important collocational category in the language, that of Verb-Object Compounds, which can be readily illustrated using corpus tools, and crops up often enough and early enough in every Chinese learner's exposure to the language to merit special study. If that sounds a bit dry, we'll also be practising some basic Mandarin, and even dabbling a little in the writing system. Not to mention learning about Sketch Engine along the way. If you're going to be at TALC, please consider joining us!
*The English project* is on *corpus-generated cloze exercises.* Scott Sommers and I are presenting a paper <http://mcu.edu.tw/%7Essmith/ccu2008-smith.pdf> on this at the 2008 Conference of English Teaching and Learning in R.O.C. <http://www.ccu.edu.tw/fllcccu/2008EIA/English/Eprogram.php> A cloze exercise has three components: a cloze sentence ("The boy stood on the burning deck"), a key ("burning") and distractors ("lukewarm", "tepid", "piping hot"..., for the sake of illustration). Our algorithm takes the key as input from the user, finds an appropriate sentence in the corpus, and supplies distractors (terms which have the same sort of distribution in the corpus as the key, but never actually occur with a particular collocate, such as "deck" in the example). [...] Any feedback on either of these projects would of course be most welcome!
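[Illustration added in editing, not part of Simon Smith's mail.] The cloze procedure described above can be sketched roughly as follows. This is a toy reconstruction under stated assumptions, not the algorithm from the paper: the corpus is a handful of invented sentences, Sketch Engine is not used, "the same sort of distribution" is approximated crudely by shared right-hand neighbours, and the names `CORPUS`, `ClozeItem` and `make_cloze` are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple

# Toy corpus, invented purely for illustration (an assumption, not real data).
CORPUS = [
    "the boy stood on the burning deck",
    "she sipped the lukewarm tea",
    "he smelled the burning toast",
    "they ate the lukewarm toast",
    "he drank the tepid tea",
    "we served the tepid soup",
    "the burning soup scalded him",
    "the wooden deck creaked",
]


class ClozeItem(NamedTuple):
    stem: str               # sentence with the key blanked out
    key: str                # the correct answer
    distractors: list[str]  # plausible but wrong answers


def make_cloze(key: str, corpus: list[str], n_distractors: int = 3) -> ClozeItem:
    """Build one cloze item for `key` from a whitespace-tokenised corpus."""
    sentences = [s.split() for s in corpus]

    # Record, for every word, the set of words seen immediately after it.
    followers = defaultdict(set)
    for tokens in sentences:
        for left, right in zip(tokens, tokens[1:]):
            followers[left].add(right)

    # Take the first sentence in which the key is followed by another word;
    # that following word is treated as the collocate.
    for tokens in sentences:
        if key in tokens[:-1]:
            i = tokens.index(key)
            collocate = tokens[i + 1]
            break
    else:
        raise ValueError(f"no usable sentence found for key {key!r}")

    stem = " ".join(tokens[:i] + ["_____"] + tokens[i + 1:])

    # Distractors: words sharing at least one right-hand neighbour with the
    # key (a crude stand-in for distributional similarity) that are never
    # seen immediately before the collocate.
    key_contexts = followers[key]
    scored = []
    for word, contexts in followers.items():
        if word == key or collocate in contexts:
            continue
        overlap = len(contexts & key_contexts)
        if overlap:
            scored.append((-overlap, word))
    distractors = [word for _, word in sorted(scored)[:n_distractors]]

    return ClozeItem(stem, key, distractors)


if __name__ == "__main__":
    print(make_cloze("burning", CORPUS))
```

With this toy corpus, "wooden" is rejected as a distractor because it occurs before "deck", which is the exclusion the description calls for. A real system would need a large corpus, lemmatisation and a proper similarity measure.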
*John Milton:* */My Words/* http://mywords.ust.hk/STU/welcome.asp You can download an MSWord toolbar called 'Check My Words' from http://mywords.ust.hk/. It takes a DDL approach to grammar-checking for learners of English, especially addressing common sentence-level errors of Chinese speakers, but useful for English learners of any L1. A companion program - 'Mark My Words' - can be used by teachers to insert comments containing relevant DDL links in students' documents.
*Linda Bawcom* mentions this page: "Professor Daniel Kies' (College of Du Page in Illinois) /The Hyper Textbook/, a textbook he wrote for his composition course. One part is dedicated to Conrad's /The Heart of Darkness/, which he presents with strings from a concordancer, and then he invites students to use a concordancer that he has set up. He also uses concordances in this hyper-textbook for examples in his explanations of grammar. By the way, this is not just your average run-of-the-mill 'grammar' book. It's worth browsing through if you teach composition or applied linguistics." http://papyr.com/hypertextbooks/grammar/conrad_heart_of_darkness.htm
*Marcia Veirano Pinto* sent me a selection of materials she had prepared, some in collaboration with Maria Cecília Lopes and Tony Berber Sardinha; though these are not available on the web she has given me permission to include them in the site I hope to create over the summer.
Thanks again to all,
Alex
Alex Boulton wrote:
> Dear all
> I'm trying to compile a list of published DDL materials for (L2)
> language learning -- not materials which are simply corpus-informed
> (from native-speaker or learner corpora), but where learners actually
> come into contact with corpus data.
> I'm particularly interested in *books, CD-ROMs, DVD-ROMs or internet
> sites* which are either wholly given over to DDL or which integrate
> DDL activities in part -- anything which shows that publishers have taken
> an interest in DDL materials. (eg Tribble & Jones Concordances in the
> Classroom; Barlow & Burdine Phrasal Verbs in Business / American
> Phrasal Verbs; Thurston & Candlin Exploring Academic English; LingoNet
> VideoCorpus; etc.)
> While I'm mainly concerned with published materials, I'd also be
> interested in any links to other DDL resources which individuals or
> groups may have produced but not published, especially on-line --
> again, not corpora, tools or interfaces on their own, but activities
> explicitly based on corpora. (eg Tim Johns' Virtual DDL Library /
> Kibbitzing One-to-Ones; Estling Vannestål & Lindquist's Corpora in
> Grammar Teaching; ICT4LT; etc.)
> The above examples are inevitably English-oriented, but materials in
> or about other languages would be more than welcome.
> I will of course post results to Corpora List, but I'd also like to
> create a web page which lists them as a complement to Tim Johns'
> data-driven learning page (last revised 06/02/97), and review as many
> as possible. I'd be grateful also then for URLs and references to
> published reviews and descriptions... or even free samples if you have
> Thanks in advance
> *Alex Boulton*
> boulton at univ-nancy2.fr
> Tel.: 03.83.96.84.44
> Fax: 03.83.96.84.49
> *All articles from /Mélanges CRAPEL/ are freely available online in
> PDF format:*
*Alex Boulton*
boulton at univ-nancy2.fr
Tel.: 03.83.96.71.30 Tel.: 03.83.96.84.44
Fax: 03.83.96.71.32 Fax: 03.83.96.84.49
*All articles from /Mélanges CRAPEL/ are freely available online in
PDF format:*