[Corpora-List] offer of research resource

Martin Wynne martin.wynne at oucs.ox.ac.uk
Wed Jun 28 11:07:01 CEST 2006


Dear Geoffrey and everyone,

I've had several messages offline asking why the OTA doesn't offer to
take this resource, so before anyone else asks, I should point out that
the Oxford Text Archive and the Arts and Humanities Data Service only
archive electronic resources, and so, unfortunately, would not be able
to offer a home for this valuable data in its current state. As I
understand it, what is needed is a traditional archive for paper
documents and magnetic media, or a project to digitise the data. (But
please correct me if I'm wrong, Geoffrey.)

If anyone out there is in a position to consider undertaking a project
to digitise it, then I understand that Professor Sampson already has a
detailed workplan. To make life even easier, the AHDS would be very
happy to offer a free service to archive, catalogue, preserve and
distribute the electronic data, on a non-exclusive basis. We could also
give advice on digitisation, if needed.

Best wishes,
Martin

--
Martin Wynne
Head of the Oxford Text Archive and
AHDS Literature, Languages and Linguistics

Oxford University Computing Services
13 Banbury Road
Oxford
UK - OX2 6NN
Tel: +44 1865 283299
Fax: +44 1865 273275
martin.wynne at oucs.ox.ac.uk


Geoffrey Sampson wrote:

> Dear Colleagues,
>
> I am looking for someone who would be interested in taking over
> responsibility for a valuable research resource I have been in charge of
> in recent years.
>
> During the 1960s, a team of linguists sponsored by the Nuffield
> Foundation assembled a collection of the spontaneous spoken and written
> English of children and young people aged between 8+ and 15+ attending a
> variety of schools of diverse types in different urban and rural English
> regions: the "Child Language Survey". (This was initially intended as
> part of a multinational effort directed at improving foreign-language
> teaching in Europe, but I understand that parallel efforts in other
> countries fell through; the material has essentially been gathering dust
> more or less ever since it was compiled.) The leading member of the
> team was Richard Handscombe, now long since retired from a Canadian
> university and in indifferent health. After I used a small portion of
> the Survey for my LUCY treebank (www.grsampson.net/RLucy.html), Richard
> generously suggested that I should take charge of the entire Survey
> material, and arranged for it to be transported to my workplace in
> Sussex, where it now is.
>
> Since then, I have made repeated attempts to get funding to computerize
> this material, clearly a necessary first step to unlocking the research
> potential it contains. Although referees' reports on my various grant
> applications have been outstandingly positive, unfortunately no
> application has finally succeeded. I now find myself too close to
> retirement for a further application to be worth making; even if I
> secured funding now, I would not have time to see the work through to
> completion. Hence I would be interested in hearing from anyone younger
> who might succeed where I have failed.
>
> In my view the collection has unparalleled potential scientific value.
> In the first place, it creates a possibility (which otherwise scarcely
> exists) of comparing spontaneous English usage across several decades of
> time -- children of the 1960s with children now, and/or the usage of a
> generation in childhood with the usage of the same generation now it is
> middle-aged. One can envisage many significant applications to the
> study of language-skills education, for instance. One anonymous grant
> referee in 2005 commented:
>
> "there is a yawning gap where there should be a research literature
> on grammatical development at school age (contrasting with a rich supply
> of research on both pre-school children and adults). What is needed
> more than anything else is precisely what this project offers:
> age-related data on speech and writing from the same children ..."
>
> The written portion of the material represents children's spontaneous
> writing abilities in a way which in my experience is hard to match even
> for present-day children. Collections of child writing often turn out to
> be heavily influenced by the adult prose they have consulted, but the
> Child Language Survey compilers found clever ways to get at what the
> children could do under their own steam. And the quality of the
> collection is extremely high. The spoken material has been transcribed
> with an accuracy that compares very favourably with the speech
> transcriptions in the British National Corpus (and I have the original
> tape-recordings as well as the transcriptions). The written material
> has been converted from the children's handwriting into typescript with
> astonishing care, so that for instance every crossed-out letter is
> identified. As a very rough estimate, the whole might comprise about
> 800,000 words of speech and about 200,000 words of writing.
>
> It will be a minor scientific tragedy, to my mind, if this material is
> lost to scholarship. Yet, if I cannot find a suitable home for it
> fairly soon, that fate looks unavoidable.
>
> Accordingly, I should be very happy to hear from anyone who feels able
> to rescue the Child Language Survey from oblivion. After handing it
> over, I would be willing, indeed eager, to retain an involvement, to the
> extent of advising on what I know about it, etc., but decisions would be
> for the new owner to make: I have no wish to be a back-seat driver. I
> would be quite willing to transfer the collection out of Britain -- I
> have the impression that scholarly values may be in a better state in
> some Continental European countries, for instance, than they are in
> British universities nowadays. (And I would be glad to supply
> documentation on my grant applications, referee reports, etc., if they
> would help someone else construct a case for support.)
>
> Anyone who would like to be considered is invited to contact me,
> commenting briefly on how he or she would hope to publish and/or exploit
> the material, and we can take it from there.
>
> Geoffrey Sampson
>
> ............................................................
> Prof. Geoffrey Sampson MA PhD MBCS CITP ILTM
>
> author of "The 'Language Instinct' Debate"
>
> Department of Informatics, University of Sussex
> Falmer, Brighton BN1 9QH, England
>
> www.grsampson.net +44 1273 678525
> ............................................................