[Corpora-List] Syntactic parsing performance by humans?

Yevgeni Berzak berzak at mit.edu
Tue May 17 15:00:29 CEST 2016


Dear Darren and corpora members,

Following the recent discussion on this list about human agreement in syntactic parsing, we wanted to let you know about a new study of ours that addresses this question. Using sentences from the Penn Treebank WSJ, our expert annotators agreed in ~94% of the cases, on par with state-of-the-art parsing performance on this dataset.
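For readers less familiar with how agreement figures of this kind are usually computed for dependency annotations, here is a minimal sketch. It assumes each annotator assigns every token a head index and a dependency label, and it counts the fraction of tokens on which two annotators match. The function name, the data layout and the toy sentence are hypothetical and purely illustrative; this is not necessarily the exact measure used in the study.

```python
from typing import List, Tuple

# One token's annotation: (head index, dependency label); head 0 marks the root.
# This layout is an assumption made for the example.
Arc = Tuple[int, str]


def agreement(a: List[Arc], b: List[Arc]) -> Tuple[float, float]:
    """Return (unlabeled, labeled) agreement between two annotations of the same tokens."""
    if len(a) != len(b):
        raise ValueError("annotations must cover the same tokens")
    if not a:
        raise ValueError("annotations must not be empty")
    # Unlabeled agreement: both annotators chose the same head for the token.
    same_head = sum(1 for (head_a, _), (head_b, _) in zip(a, b) if head_a == head_b)
    # Labeled agreement: both head and dependency label match.
    same_both = sum(1 for arc_a, arc_b in zip(a, b) if arc_a == arc_b)
    n = len(a)
    return same_head / n, same_both / n


# Hypothetical toy sentence: "She saw the dog"
annotator_1 = [(2, "nsubj"), (0, "root"), (4, "det"), (2, "dobj")]
annotator_2 = [(2, "nsubj"), (0, "root"), (4, "det"), (2, "iobj")]

unlabeled, labeled = agreement(annotator_1, annotator_2)
print(unlabeled, labeled)  # 1.0 0.75
```

In this toy case the two annotators agree on every head but disagree on one label, so unlabeled agreement is 4/4 and labeled agreement is 3/4. Corpus-level figures are typically obtained by pooling the token counts over all sentences.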

A preprint of the study, which also discusses the issue of annotation bias, is available here: Yevgeni Berzak, Yan Huang, Andrei Barbu, Anna Korhonen and Boris Katz (2016). Bias and Agreement in Syntactic Annotations <http://arxiv.org/pdf/1605.04481v1.pdf>. arXiv preprint.

Best regards,

Yevgeni

---------------------------------------------------------------------------------------

Date: Fri, 13 May 2016 12:55:59 +0100
From: Darren Cook <darren at dcook.org>
Subject: [Corpora-List] Syntactic parsing performance by humans?
To: "corpora at uib.no" <corpora at uib.no>

Google have trained a neural net (part of publicizing their open-source TensorFlow framework?) to parse syntax, claiming it is the world's best:

http://googleresearch.blogspot.co.uk/2016/05/announcing-syntaxnet-worlds-most.html

I just wanted to quote this bit, on performance (they've called it Parsey McParseface):

"Parsey McParseface recovers individual dependencies between words with over 94% accuracy, ... While there are no explicit studies in the literature about human performance, we know from our in-house annotation projects that linguists trained for this task agree in 96-97% of the cases ... Sentences drawn from the web are a lot harder to analyze, ...[it] achieves just over 90% of parse accuracy on this dataset."

Are there really no studies of human performance?! Surely some professor has hinted to their PhD students that it is a nice bit of relatively easy linguistics research that should also get them cited a lot...

(I was mainly curious what the human performance gap between Penn Treebank and Google WebTreebank would be, and whether it would be more or less than the 4% gap for the deep learning algorithm.)

Darren


