I must confess, the idea that a corpus can be described in terms of "parseability" sounds a little ill-founded to me. The choice of a particular parsing algorithm may dictate which examples are hard to process, as will the underlying grammar, and so on. <br>
<br>What would be interesting (read: hard) would be to look at the work on phase transitions in 3-SAT problems and the like. Are there underlying graph-related characteristics of parsing that make certain sentences intrinsically hard to process and, in particular, can these characteristics be framed in a manner that is independent of the actual parser? <br>