[Corpora-List] Copy of the Hewlett-Packard test suite?

Stephan Oepen oe at ifi.uio.no
Thu Nov 27 23:48:00 CET 2008


hi kevin, my apologies for a late reply to your query!


> Can anyone point me towards a copy of the old Hewlett-Packard
> syntactic test suite?

dan flickinger has been maintaining the original HP test suite as part of his work on the English Resource Grammar. the HP data was imported into the TSNLP annotation scheme in the mid-1990s and annotated using the relatively shallow TSNLP phenomenon classification; under the name CSLI test suite, it has been part of my [incr tsdb()] distribution in recent years.

you can browse a treebanked version (in LinGO Redwoods style) on-line:

http://erg.emmtee.net/compare?data=gold/erg/csli

the full test suite (in TSNLP format) is available for download too:

http://svn.emmtee.net/tags/handon/lingo/lkb/src/tsdb/skeletons/english/csli

to just extract the actual test items plus grammaticality judgements, the following should work:

awk -F@ '{printf("[%d] %s%s\n", $1, $8 ? "" : "*", $7)}' item
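
as a rough illustration of what that command does, here is a small sketch run on two made-up records (these are not actual CSLI items, and the file name item.sample is just a placeholder). it assumes only what the command itself implies: fields are separated by '@', field 1 is the item id, field 7 the input string, and field 8 is non-zero for grammatical items.

```shell
# two invented records, NOT real test suite data; fields 2-6 are dummy filler
cat > item.sample <<'EOF'
1@a@b@c@d@e@the dog barks.@1
2@a@b@c@d@e@the dog bark.@0
EOF

# same awk program as above, applied to the sample file instead of 'item':
# print the id in brackets, then the string, starred when field 8 is zero
out=$(awk -F@ '{printf("[%d] %s%s\n", $1, $8 ? "" : "*", $7)}' item.sample)
printf '%s\n' "$out"
# [1] the dog barks.
# [2] *the dog bark.
```

ungrammatical items come out prefixed with the usual asterisk, so the output can be read like a conventional linguistic example list.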

as far as i recall, dan may have made a small number of adjustments since the original HP release, but if so, these changes would be trivial in nature, i believe. maybe dan or someone else still has a copy of the original HP file? i think i do too, only i cannot find it :-).

all best - oe

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+++ Universitetet i Oslo (IFI); Boks 1080 Blindern; 0316 Oslo; (+47) 2284 0125
+++ CSLI Stanford; Ventura Hall; Stanford, CA 94305; (+1 650) 723 0515
+++ --- oe at ifi.uio.no; oe at csli.stanford.edu; stephan at oepen.net ---
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


