In the CoNLL shared tasks on dependency parsing in 2006 and 2007, a number of treebanks were used, some of which do not yet seem to be part of the database:
  Prague Arabic Dependency Treebank
  Basque Dependency Treebank
  Sinica Treebank (Chinese)
  Penn Treebank (English)
  Tiger Treebank (German)
  Greek Dependency Treebank
  Szeged Treebank (Hungarian)
  Italian Syntactic-Semantic Treebank
  Verbmobil Treebank (Japanese)
  Floresta Sintactica (Portuguese)
  Metu-Sabanci Turkish Treebank
Of course, not all of these are genuine dependency treebanks, but judging from the treebanks included in the database so far, this does not seem to be a necessary requirement.
In addition, there are two dependency treebanks for Latin, although I think they are being merged, and a third treebank for Italian (the Venice Italian Treebank), which also exists in a dependency version.
On Sun, 3 Feb 2008, Eric Atwell wrote:
> On Fri, 1 Feb 2008, Olga Pustylnikov wrote:
> > My question is: do other treebanks exist which are not part of the database?
> > If you know of an existing treebank that should be transformed into the
> > unified format please, let me know.
> The AMALGAM multi-parsed treebank is a small sample of 60 sentences
> parsed according to 14 different parsing schemes (parser outputs or
> corpus annotation schemes); it might be an interesting challenge to
> see whether/how these different representations can be transformed
> into eGXL.
> Eric Atwell, University of Leeds, WWW/email: google Eric Atwell
==================================================================
Joakim Nivre

Växjö University
School of Mathematics and Systems Engineering
SE-35195 Växjö
Tel: +46 470 708992
Fax: +46 470 84004
E-mail: nivre at msi.vxu.se
URL: http://www.msi.vxu.se/users/nivre

Uppsala University
Department of Linguistics and Philology
Box 635, SE-75126 Uppsala
Tel: +46 18 4717009
Fax: +46 18 4711094
E-mail: joakim.nivre at lingfil.uu.se
==================================================================