[Corpora-List] Syntactic parsing performance by humans?

Koos Wilt kooswilt at gmail.com
Fri May 13 21:37:02 CEST 2016


OK, this, at least, conforms to the original request: "Parsing, comparisons between parsers, and other comparative studies, where components are viewed as modular entities in an entire system, are the subject of [2, 3, 9, 23, 25, 28, 32, 33]." The numbers refer to my bibliography, which was compiled at the Wageningen University Department of Plant Sciences for Ms Judith Risse, a PhD candidate under Prof Jack Leunissen.

https://www.google.nl/search?q=%5Bpicture+Wageningen+UNiversity&newwindow=1&biw=1517&bih=741&tbm=isch&imgil=n4UMApdo10HSrM%253A%253Bgnx1dqe-HmpP3M%253Bhttps%25253A%25252F%25252Fen.wikipedia.org%25252Fwiki%25252FWageningen_University_and_Research_Centre&source=iu&pf=m&fir=n4UMApdo10HSrM%253A%252Cgnx1dqe-HmpP3M%252C_&usg=__mq9ixIYqhn5S521QCQetcNEFXLU%3D&dpr=0.9&ved=0ahUKEwj99fid5tfMAhVC6xQKHWB0DE0QyjcIJQ&ei=tiw2V_3CCMLWU-DosegE#imgrc=n4UMApdo10HSrM%3A

-K

2016-05-13 21:31 GMT+02:00 Koos Wilt <kooswilt at gmail.com>:


> It is not so much an overview of parser performance as it is a
> bibliography. Many of the entries, however, do contain parser evaluations.
> Hope it's still useful.
>
> -K
>
> 2016-05-13 20:26 GMT+02:00 Koos Wilt <kooswilt at gmail.com>:
>
>> Also, please allow me to put in a plug for the Stanford Parser. I cannot
>> claim it performs worse or better than Google's, but it has become my
>> trusty war-horse.
>>
>> 2016-05-13 20:24 GMT+02:00 Koos Wilt <kooswilt at gmail.com>:
>>
>>> Bob Berwick
>>> 20:13 (10 minutes ago)
>>> to me
>>> would be useful for the list and the community. please do.
>>> Koos Wilt <kooswilt at gmail.com>
>>> 20:14 (9 minutes ago)
>>> to Bob
>>> Coming up tomorrow or so.
>>>
>>> 2016-05-13 19:51 GMT+02:00 Koos Wilt <kooswilt at gmail.com>:
>>>
>>>> I wrote an overview of the performance of parsers about 4 years ago.
>>>> Would sending it somewhere (e.g. to Mr Brew) be helpful to anyone? It's on
>>>> my other laptop so I have to dig for it.
>>>>
>>>> Best,
>>>>
>>>>
>>>> -K
>>>>
>>>> 2016-05-13 19:30 GMT+02:00 chris brew <cbrew at acm.org>:
>>>>
>>>>> It is an unarguable fact that Google's parser gets a higher score on
>>>>> the metrics chosen, which are completely standard in the NLP community.
>>>>> What is really being measured is the percentage of correct links in a
>>>>> graph that connects words to words via labeled links. If, as is common,
>>>>> there are many words in the sentence, there will be many links too, and
>>>>> many opportunities for mistakes. You could get a 90% score and still
>>>>> have a mistake or two in nearly every sentence.
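>>>>>
>>>>> To make that arithmetic concrete, here is a minimal sketch in Python
>>>>> (the 90% figure and the 20-link sentence length are illustrative
>>>>> assumptions, not numbers from Google's evaluation):
>>>>>
>>>>>     # Attachment score: the fraction of tokens whose predicted head
>>>>>     # (and, for the labeled score, relation label) matches the gold tree.
>>>>>     def attachment_score(gold, predicted, labeled=True):
>>>>>         assert len(gold) == len(predicted)
>>>>>         correct = sum(
>>>>>             g_head == p_head and (not labeled or g_lab == p_lab)
>>>>>             for (g_head, g_lab), (p_head, p_lab) in zip(gold, predicted)
>>>>>         )
>>>>>         return correct / len(gold)
>>>>>
>>>>>     # A toy 3-token sentence: one label is wrong, all heads are right.
>>>>>     gold = [(2, "nsubj"), (0, "root"), (2, "dobj")]
>>>>>     pred = [(2, "nsubj"), (0, "root"), (2, "iobj")]
>>>>>     print(attachment_score(gold, pred))                  # labeled: 0.67
>>>>>     print(attachment_score(gold, pred, labeled=False))   # unlabeled: 1.0
>>>>>
>>>>>     # If each link is right with probability 0.9, independently, a
>>>>>     # 20-link sentence is fully correct only 0.9**20 of the time.
>>>>>     p_link, n_links = 0.90, 20
>>>>>     print(1 - p_link ** n_links)  # ~0.88: an error in ~9 of 10 sentences
>>>>>
>>>>> (The independence assumption is rough, but it shows why a high per-link
>>>>> score can still leave very few sentences fully correct.)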
>>>>>
>>>>> Whether this quality level is OK depends entirely on what use you plan
>>>>> to make of the graph that has been produced.
>>>>>
>>>>> The Penn Treebank was made many years ago, with version 2 coming out
>>>>> in 1995. We have learnt a lot about how to annotate corpora and evaluate
>>>>> parsing since then. The Web Treebank is much newer and reflects painfully
>>>>> learned best practices, so it should be of good quality; on the other
>>>>> hand, it deals with much messier language, so performance scores are lower.
>>>>>
>>>>>
>>>>> The current practice of evaluating individual dependencies was
>>>>> introduced as a result of major deficiencies in the first evaluation
>>>>> metrics that were used. It has the major plus of being transparent and
>>>>> straightforward. I believe that improvements on this metric will usually
>>>>> translate into improvements for downstream tasks that use parses as
>>>>> input, which I was less sure of with the earlier metrics. This is
>>>>> progress, but quite modest progress.
>>>>>
>>>>>
>>>>> On 13 May 2016 at 12:55, Darren Cook <darren at dcook.org> wrote:
>>>>>
>>>>>> Google have trained a neural net (part of publicizing their
>>>>>> open-source
>>>>>> TensorFlow framework?) to parse syntax, claiming it is the world's
>>>>>> best:
>>>>>>
>>>>>>
>>>>>> http://googleresearch.blogspot.co.uk/2016/05/announcing-syntaxnet-worlds-most.html
>>>>>>
>>>>>> I just wanted to quote this bit on performance (they've called it
>>>>>> Parsey McParseface):
>>>>>>
>>>>>> "Parsey McParseface recovers individual dependencies between words
>>>>>> with over 94% accuracy, ... While there are no explicit studies in the
>>>>>> literature about human performance, we know from our in-house
>>>>>> annotation
>>>>>> projects that linguists trained for this task agree in 96-97% of the
>>>>>> cases ... Sentences drawn from the web are a lot harder to analyze,
>>>>>> ...[it] achieves just over 90% of parse accuracy on this dataset. "
>>>>>>
>>>>>> Are there really no studies of human performance?! Surely some
>>>>>> professor has hinted to their PhD students that it is a nice bit of
>>>>>> relatively easy linguistics research that should also get them cited
>>>>>> a lot...
>>>>>>
>>>>>> (I was mainly curious what the human performance gap between Penn
>>>>>> Treebank and Google WebTreebank would be; if it would be more or less
>>>>>> than the 4% gap for the deep learning algorithm.)
>>>>>>
>>>>>> Darren
>>>>>>
>>>>
>>>
>>
>