[Corpora-List] Corpus size and accuracy of frequency listings

Michael Oakes Michael.Oakes
Fri Apr 3 19:03:03 CEST 2009

Dear Mark,

I think the answer may come from forensic statistics. The problem is analogous to deciding how many units to sample from a seized consignment in order to estimate the proportion of units in the whole consignment that are contaminated with an illegal substance.

Section 16.3 of David Lucy's book "Forensic Statistics", entitled "How many drugs to sample", refers to the 2003 ENFSI (European Network of Forensic Science Institutes) report, which is available on the web and details four different methods. In addition, Lucy cites Izenman (2001), who provides two "rough and ready" methods.
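To give a feel for the kind of calculation involved, here is a minimal sketch of a simple binomial-style argument. It is an illustration only and is not claimed to reproduce any specific method in the sources cited here. It assumes a large consignment, sampling that can be treated as independent draws, and that every sampled unit turns out positive. Under those assumptions, if the true proportion of positive units were only theta, the chance of seeing n positives in a row would be theta**n, so we look for the smallest n that makes this chance no larger than alpha. The function name and parameter names are hypothetical.

```python
def min_sample_size(theta: float, alpha: float) -> int:
    """Smallest n such that theta**n <= alpha.

    theta: the proportion we want to claim the consignment exceeds (0 < theta < 1).
    alpha: the accepted risk of making that claim wrongly (0 < alpha < 1).
    Assumes independent draws and that all n sampled units test positive.
    """
    if not (0.0 < theta < 1.0 and 0.0 < alpha < 1.0):
        raise ValueError("theta and alpha must lie strictly between 0 and 1")
    n = 1
    # Increase n until n consecutive positives would be sufficiently
    # unlikely if the true proportion were only theta.
    while theta ** n > alpha:
        n += 1
    return n


if __name__ == "__main__":
    # Illustrative inputs: claim "at least 90% positive" at 95% confidence.
    print(min_sample_size(0.9, 0.05))
```

With these illustrative inputs the answer does not depend on the size of the consignment, which is why this kind of approximation is only reasonable when the consignment is large relative to the sample.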

ENFSI (2003). Guidelines on Representative Drug Sampling. European Network of Forensic Science Institutes, Drugs Working Group.

Izenman, A. J. (2001). Statistical and legal aspects of the forensic study of illicit drugs. Statistical Science 16(1):35-37.

Regards, Michael.
