[Corpora-List] offer of research resource

Geoffrey Sampson grs2 at sussex.ac.uk
Tue Jun 27 12:31:01 CEST 2006


Dear Colleagues,

I am looking for someone who would be interested in taking over
responsibility for a valuable research resource I have been in charge of
in recent years.

During the 1960s, a team of linguists sponsored by the Nuffield
Foundation assembled a collection of the spontaneous spoken and written
English of children and young people aged between 8+ and 15+ attending a
variety of schools of diverse types in different urban and rural English
regions: the "Child Language Survey". (This was initially intended as
part of a multinational effort directed at improving foreign-language
teaching in Europe, but I understand that parallel efforts in other
countries fell through; the material has been gathering dust more or
less ever since it was compiled.) The leading member of the
team was Richard Handscombe, now long since retired from a Canadian
university and in indifferent health. After I used a small portion of
the Survey for my LUCY treebank (www.grsampson.net/RLucy.html), Richard
generously suggested that I should take charge of the entire Survey
material, and arranged for it to be transported to my workplace in
Sussex, where it now is.

Since then, I have made repeated attempts to get funding to computerize
this material, clearly a necessary first step to unlocking the research
potential it contains. Although referees' reports on my various grant
applications have been outstandingly positive, unfortunately no
application has ultimately succeeded. I now find myself too close to
retirement for a further application to be worth making; even if I
secured funding now, I would not have time to see the work through to
completion. Hence I would be interested in hearing from anyone younger
who might succeed where I have failed.

In my view the collection has unparalleled potential scientific value.
In the first place, it creates a possibility (which otherwise scarcely
exists) of comparing spontaneous English usage across several decades of
time -- children of the 1960s with children now, and/or the usage of a
generation in childhood with the usage of the same generation now it is
middle-aged. One can envisage many significant applications to the
study of language-skills education, for instance. One anonymous grant
referee in 2005 commented:

"there is a yawning gap where there should be a research literature
on grammatical development at school age (contrasting with a rich supply
of research on both pre-school children and adults). What is needed
more than anything else is precisely what this project offers: age-
related data on speech and writing from the same children ..."

The written portion of the material represents children's spontaneous
writing abilities in a way which in my experience is hard to match even
for present-day children. Collections of child writing often turn out to
be heavily influenced by adult prose the children have consulted, but the
Child Language Survey compilers found clever ways to get at what the
children could do under their own steam. And the quality of the
collection is extremely high. The spoken material has been transcribed
with an accuracy that compares very favourably with the speech
transcriptions in the British National Corpus (and I have the original
tape-recordings as well as the transcriptions). The written material
has been converted from the children's handwriting into typescript with
astonishing care, so that for instance every crossed-out letter is
identified. As a very rough estimate, the whole might comprise about
800,000 words of speech and about 200,000 words of writing.

It will be a minor scientific tragedy, to my mind, if this material is
lost to scholarship. Yet, if I cannot find a suitable home for it
fairly soon, that fate looks unavoidable.

Accordingly, I should be very happy to hear from anyone who feels able
to rescue the Child Language Survey from oblivion. After handing it
over, I would be willing, indeed eager, to retain an involvement, to the
extent of advising on what I know about it, etc., but decisions would be
for the new owner to make: I have no wish to be a back-seat driver. I
would be quite willing to transfer the collection out of Britain -- I
have the impression that scholarly values may be in a better state in
some Continental European countries, for instance, than they are in
British universities nowadays. (And I would be glad to supply
documentation on my grant applications, referee reports, etc., if they
would help someone else construct a case for support.)

Anyone who would like to be considered is invited to contact me,
commenting briefly on how he or she would hope to publish and/or exploit
the material, and we can take it from there.

Geoffrey Sampson


............................................................
Prof. Geoffrey Sampson MA PhD MBCS CITP ILTM

author of "The 'Language Instinct' Debate"

Department of Informatics, University of Sussex
Falmer, Brighton BN1 9QH, England

www.grsampson.net +44 1273 678525
............................................................





