Although in some of the areas where I work small "visible" corpora are often very useful, a long time ago (Tribble, C. and G. Jones (1990) Concordances in the Classroom, Harlow: Longman) I discussed how corpus tools can "make the invisible visible". Since then I have been strongly aware of standing on the shoulders of colleagues who have been redefining my understanding of language through studies of very large data sets. Yes, in part of my work I draw on insights from Swales' work in genre analysis, on pioneering critical studies such as Fowler, R. (1991) Language in the news, London: Routledge, and on the kinds of simple counting that Halliday recommended as far back as 1973 in his seminal study of Golding's "The Inheritors" (Halliday, M.A.K. (1973) Explorations in the Functions of Language, London: Edward Arnold). But I also know that an awful lot of the valuable insights into what needs to be taught that have emerged in the last 20 years have arisen from work on very large corpora that are not humanly viewable.
Gosh - that's got that off my chest anyway. What an interesting discussion this is proving to be!
Chris
--
IN LONDON TODAY
Dr Christopher Tribble
EMAIL || ctribble at clara.co.uk
WEB || www.ctribble.co.uk
BLOG || http://ctribble.blogspot.com
> -----Original Message-----
> From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no]
> On Behalf Of Yorick Wilks
> Sent: 19 August 2008 15:31
> To: Lou Burnard
> Cc: CORPORA List
> Subject: Re: [Corpora-List] Boot Camp (Continued...)
> I hope it is obvious that nothing I wrote earlier was with
> any intended disrespect to the study of corpora in language
> teaching and translation; both are activities of which I
> think highly, of course.
> Christopher Tribble's contribution was the clearest reference
> to the practice of using corpora in these activities, and I
> would maintain it is, however much improved and refined, a
> traditional activity known for centuries; it requires corpora
> to be humanly "viewable"--to use the same rather
> unsatisfactory word as I did earlier. That seems to me a long
> way from computing over very large corpora that are not
> humanly viewable or assimilable or even readable: it is that
> activity that both computational linguistics and corpus
> linguistics claim to be involved in, and it is there that the
> source of the negativity seems to be. This long discussion
> has been full of little remarks by corpus linguists pulling
> their skirts tight about them and saying how they don't want
> to read or know about, let alone do, what computational
> linguists do with corpora. It is that reaction that I still
> find puzzling; why does anyone care what consenting adults do
> with corpora in the privacy of their computers? Such activity
> either produces demonstrable results, or useful artifacts,
> or it does not--what else is there to say?
> I confessed earlier that CL/NLP is having a dull patch as a
> whole, but that is not true of machine translation, which is
> having a mini-renaissance, with methods sometimes called
> statistical (following Mercer and Jelinek) and sometimes
> example-based (following Nagao). In fact there is no real
> difference between them and both rest entirely on corpus data
> provided by human translators, whose skill they attempt to
> learn, and with increasing success, as any user of internet
> free translators knows. There is no clear dividing line here
> at all between the parts of this large field, only, it seems,
> bad feelings.
> On 19 Aug 2008, at 10:52, Lou Burnard wrote:
> > Yorick says "what I have never been able to see is what corpus
> > linguistics (in the sense in which the phrase is owned by the main
> > contributors to this debate) is FOR, except the production of better
> > dictionaries"
> > As far as I am aware one of the largest communities interested in
> > consuming the fruits of corpus linguistics (whether you're talking
> > about corpora or the methods attached to them) is that of people
> > engaged in the humdrum but utterly mysterious business of language
> > teaching and translation.
> > Sadly, none of that community seems to have seen fit (yet) to
> > contribute to the present discussion. But I think if they did they
> > might suggest that corpus linguistics is very definitely "for" those
> > wanting to ground their pedagogic practice in language as experienced
> > rather than language as theorized (which is of course experience too,
> > but not quite the same order).
> > Lou