[Corpora-List] Ethical review of spoken corpus collection

John Du Bois dubois at linguistics.ucsb.edu
Sun Apr 10 04:19:17 CEST 2011

Regarding corpora of spoken language, one approach is to adopt a PUBLICATION model. From the beginning, you tell people that the transcriptions and audio will be published. (This is what we did with the Santa Barbara Corpus of Spoken American English, published by the Linguistic Data Consortium, http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2000S85; see also http://www.linguistics.ucsb.edu/research/sbcorpus.html.) Once it is published, it is simply out there on CDs and the like, in libraries and elsewhere, and can be used for eternity like any other published document you encounter in the library. Obviously this requires appropriate consent from the beginning, but it is worth going to the trouble to get it.

John Du Bois
Department of Linguistics
University of California, Santa Barbara
