Philipp Koehn. 2004. Statistical significance tests for machine translation evaluation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).
A freely available implementation of it, with source code, also exists. (I am not sure about the exact license.)
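
In case it is useful, the resampling procedure Chris describes in his message below can be sketched in a few lines of Python. This is only an illustration, not the implementation from the paper above: the function name bootstrap_ci, its parameters, and the per-text counts are all invented for the example, and it assumes a simple percentile interval is good enough.

```python
import random
import statistics


def bootstrap_ci(sample, statistic=statistics.mean,
                 n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `statistic` (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Draw n_resamples resamples of size n, randomly and with replacement,
    # and compute the statistic on each one.
    estimates = sorted(
        statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled estimates.
    lower_idx = int(n_resamples * alpha / 2)
    upper_idx = min(n_resamples - 1, int(n_resamples * (1 - alpha / 2)))
    return estimates[lower_idx], estimates[upper_idx]


# Hypothetical data: occurrences of some feature in each of 12 texts.
counts = [4, 7, 2, 9, 5, 6, 3, 8, 5, 4, 10, 6]
low, high = bootstrap_ci(counts)
print(f"mean = {statistics.mean(counts):.2f}, 95% interval = [{low:.2f}, {high:.2f}]")
```

The width of the interval gives a rough idea of how much the statistic would vary from sample to sample, without assuming any particular reference distribution.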
Best,
-Yuval
On Fri, Oct 14, 2011 at 4:09 AM, Jin-Dong Kim <jdkim at dbcls.rois.ac.jp> wrote:
> Dear Chris,
>
> I am not sure if you consider it as a corpus linguistics study, but
> bootstrap resampling techniques were indeed used in this work:
>
> @article{Sang:2002:MSP:944790.944818,
> author = {Sang, Erik F. Tjong Kim},
> title = {Memory-based shallow parsing},
> journal = {J. Mach. Learn. Res.},
> volume = {2},
> month = {March},
> year = {2002},
> issn = {1532-4435},
> pages = {559--594},
> numpages = {36},
> url = {http://dl.acm.org/citation.cfm?id=944790.944818},
> acmid = {944818},
> publisher = {JMLR.org},
> keywords = {feature selection, memory-based learning, shallow
> parsing, system combination},
> }
>
> Hope it helps.
>
> Best,
>
> Jin-Dong
>
> On Thu, Oct 13, 2011 at 11:43 PM, <CRuehlemann at aol.com> wrote:
> > Dear all,
> >
> >
> >
> > It is not uncommon in quantitative corpus linguistic studies that a
> > significance test cannot be performed, because one cannot compare the
> > distribution of a variable against that of another comparable variable,
> > against a specific distribution (e.g. normal, exponential, etc.), or
> > against an a priori stipulated value. To nonetheless assess whether the
> > distribution in the sample is simply due to chance or a reflection of
> > the true distribution in the population, statisticians often use the
> > bootstrap method. This method is a resampling method: from the sample,
> > a large number of resamples are drawn randomly and with replacement.
> >
> >
> >
> > Is anyone aware of any (corpus) linguistic study/studies which has/have
> > used this method?
> >
> >
> >
> > Many thanks in advance
> >
> >
> >
> > Chris
> >
> > _______________________________________________
> > UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> > Corpora mailing list
> > Corpora at uib.no
> > http://mailman.uib.no/listinfo/corpora
> >
> >
>
>
>
> --
> Jin-Dong Kim, Ph.D,
> Project Associate Professor,
> Database Center for Life Science (DBCLS),
> Research Organization of Information and Systems (ROIS)
> home: http://dbcls.rois.ac.jp/~jdkim
> e-mail: jdkim at dbcls.rois.ac.jp
>