[Corpora-List] Syntactic parsing performance by humans?

Koos Wilt kooswilt at gmail.com
Fri May 13 20:24:41 CEST 2016

Bob Berwick, 20:13 (10 minutes ago), to me: would be useful for the list and the community. please do.
Koos Wilt <kooswilt at gmail.com>, 20:14 (9 minutes ago), to Bob: Coming up tomorrow or so.

2016-05-13 19:51 GMT+02:00 Koos Wilt <kooswilt at gmail.com>:

> I wrote an overview of the performance of parsers about 4 years ago. Would
> sending it somewhere (e.g. to Mr Brew) be helpful to anyone? It's on my
> other laptop so I have to dig for it.
> Best,
> -K
> 2016-05-13 19:30 GMT+02:00 chris brew <cbrew at acm.org>:
>> It is an unarguable fact that Google's parser gets a higher score on the
>> metrics chosen, which are completely standard in the NLP community. What is
>> actually being measured is the percentage of correct links in a graph that
>> connects words to words via labeled links. If, as is common, there are
>> many words in the sentence, there will be many links too, and many
>> opportunities for mistakes. You could get a 90% score and still have a
>> mistake or two in nearly every sentence.
>> Whether this quality level is OK depends entirely on what use you plan to
>> make of the graph that has been produced.
>> The Penn Treebank was made many years ago, with version 2 coming out in
>> 1995. We have learnt a lot about how to annotate corpora and evaluate
>> parsing since then. The Web Treebank is much newer, and reflects painfully
>> learned best practices, so should be good quality, but is on the other hand
>> dealing with much messier language, so performance scores are lower.
>> The current practice of evaluating individual dependencies was introduced
>> in response to major deficiencies in the first evaluation metrics that were
>> used. It has the major advantage of being transparent and straightforward. I
>> believe that improvements on this metric will usually translate into
>> improvements for downstream tasks that use parsing as input, which I was
>> less sure of with the earlier metrics. This is progress, but quite modest progress.
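[Editor's note: Brew's per-link metric can be sketched in a few lines of Python. This is a minimal illustration, not SyntaxNet's actual evaluation code; representing a parse as a list of head indices per token (CoNLL-style) is an assumption for the sketch.]

```python
def attachment_score(gold_heads, pred_heads):
    """Fraction of tokens whose predicted head matches the gold head
    (unlabeled attachment score). Each list holds one head index per token."""
    assert len(gold_heads) == len(pred_heads)
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return correct / len(gold_heads)

# Toy 4-token sentence: one wrong head out of four -> score 0.75.
print(attachment_score([2, 0, 2, 2], [2, 0, 2, 1]))

# Brew's point in numbers: at 90% per-dependency accuracy, and assuming
# (simplistically) independent errors, a 20-word sentence is fully correct
# with probability only 0.9**20, about 0.12 -- so most long sentences
# contain at least one mistake.
print(0.9 ** 20)
```

This makes concrete why a seemingly high per-link score can still mean "a mistake or two in nearly every sentence".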
>> On 13 May 2016 at 12:55, Darren Cook <darren at dcook.org> wrote:
>>> Google have trained a neural net (part of publicizing their open-source
>>> TensorFlow framework?) to parse syntax, claiming it is the world's best:
>>> http://googleresearch.blogspot.co.uk/2016/05/announcing-syntaxnet-worlds-most.html
>>> I just wanted to quote this bit, on performance (they've called it
>>> Parsey McParseface):
>>> "Parsey McParseface recovers individual dependencies between words
>>> with over 94% accuracy, ... While there are no explicit studies in the
>>> literature about human performance, we know from our in-house annotation
>>> projects that linguists trained for this task agree in 96-97% of the
>>> cases ... Sentences drawn from the web are a lot harder to analyze,
>>> ...[it] achieves just over 90% of parse accuracy on this dataset. "
>>> Are there really no studies of human performance?! Surely some professor
>>> has hinted to their PhD students that it is a nice bit of relatively
>>> easy linguistics research that should also get them cited a lot...
>>> (I was mainly curious what the human performance gap between Penn
>>> Treebank and Google WebTreebank would be; if it would be more or less
>>> than the 4% gap for the deep learning algorithm.)
>>> Darren
>>> _______________________________________________
>>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>>> Corpora mailing list
>>> Corpora at uib.no
>>> http://mailman.uib.no/listinfo/corpora
