[Corpora-List] Syntactic parsing performance by humans?

Darren Cook darren at dcook.org
Fri May 13 13:55:59 CEST 2016


Google have trained a neural net (part of publicizing their open-source TensorFlow framework?) to parse syntax, claiming it is the world's best:

http://googleresearch.blogspot.co.uk/2016/05/announcing-syntaxnet-worlds-most.html

I just wanted to quote this bit on performance (they've called it Parsey McParseface):

"Parsey McParseface recovers individual dependencies between words with over 94% accuracy, ... While there are no explicit studies in the literature about human performance, we know from our in-house annotation projects that linguists trained for this task agree in 96-97% of the cases ... Sentences drawn from the web are a lot harder to analyze, ...[it] achieves just over 90% of parse accuracy on this dataset."

Are there really no studies of human performance?! Surely some professor has hinted to their PhD students that it is a nice bit of relatively easy linguistics research that should also get them cited a lot...

(I was mainly curious what the human performance gap between Penn Treebank and Google WebTreebank would be, and whether it would be more or less than the roughly 4-point gap, from just over 94% to just over 90%, for the deep learning algorithm.)
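
For what it's worth, the measurement I have in mind is easy to sketch. The Python below is only an illustration under my own assumptions: the function name and the toy sentence are made up, and it computes plain per-token head agreement between two annotators, which may not be how Google scored its in-house projects.

```python
def attachment_agreement(heads_a, heads_b):
    """Fraction of tokens for which two annotations pick the same head.

    Each argument is a list with one head index per token (0 = root).
    This is an assumed, simplified metric: unlabeled, per-token agreement.
    """
    if len(heads_a) != len(heads_b):
        raise ValueError("annotations must cover the same tokens")
    if not heads_a:
        raise ValueError("cannot score an empty sentence")
    same = sum(1 for a, b in zip(heads_a, heads_b) if a == b)
    return same / len(heads_a)


# Hypothetical toy data for "I saw her duck" (tokens numbered from 1).
# Annotator 1 reads "duck" as a noun, so "her" attaches to "duck" (4).
# Annotator 2 reads "duck" as a verb, so "her" attaches to "saw" (2).
annotator_1 = [2, 0, 4, 2]
annotator_2 = [2, 0, 2, 2]

agreement = attachment_agreement(annotator_1, annotator_2)
print(f"agreement: {agreement:.2f}")
```

Running the same comparison over doubly annotated newswire sentences and doubly annotated web sentences would give the two human figures, and their difference would be the human gap I am asking about.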

Darren
