[Corpora-List] New resources and tools for GermaNet

Verena Henrich verena.henrich
Mon Jan 7 15:04:21 CET 2013

Apologies for cross-postings.

We are happy to announce the availability of three new resources for the German wordnet GermaNet:

1) WebCAGe (short for: Web-Harvested Corpus Annotated with GermaNet Senses) is a domain-independent web-harvested corpus that has been semi-automatically annotated with senses from GermaNet. The corpus has been constructed on the basis of the alignment of GermaNet senses with senses from the online dictionary Wiktionary. Wiktionary senses are frequently illustrated by one or more example sentences, which in turn are often linked to external references, including Wikipedia articles and other textual web sources. The GermaNet-Wiktionary alignment, together with the pointers contained in Wiktionary example sentences, therefore made it possible to automatically assemble a corpus annotated with GermaNet senses. To ensure good quality, all automatic annotations have been manually verified. WebCAGe is freely available for download at: http://www.sfs.uni-tuebingen.de/en/ascl/resources/corpora/webcage.html

2) Semantic Relatedness API: To calculate the semantic relatedness between two word senses in GermaNet, we have reimplemented for German a suite of semantic relatedness algorithms that are well known for English, including the methods proposed by Leacock and Chodorow (1998), Wu and Palmer (1994), Hirst and St-Onge (1998), Resnik (1995), Jiang and Conrath (1997), Lin (1998), and Lesk (1986): http://www.sfs.uni-tuebingen.de/GermaNet/tools.shtml#SemRelAPI

3) GernEdiT (GermaNet Editing Tool) is a graphical editor that supports several ways to search and visualize GermaNet data. It can be used to browse through the GermaNet graph, to list all senses of a word, and to view all properties of a particular word sense, including relations to other word senses, Wiktionary paraphrases, and English translations. GernEdiT is used by the GermaNet lexicographers to maintain and extend the GermaNet database in a user-friendly way. We have now made GernEdiT available to interested users: http://www.sfs.uni-tuebingen.de/GermaNet/tools.shtml#GernEdiT
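
As a rough illustration of the path-based measures named in item 2), the following Python sketch computes Wu and Palmer (1994) and Leacock and Chodorow (1998) scores over a tiny hand-made hypernym tree. It is not the Semantic Relatedness API and does not use GermaNet data: the taxonomy, the function names, and the choice to count depth and path length in nodes are all assumptions made for this example, and published variants of these measures differ in such details.

```python
import math

# Hypothetical toy hypernym tree (child -> parent); NOT real GermaNet data.
HYPERNYM = {
    "Lebewesen": None,
    "Tier": "Lebewesen",
    "Pflanze": "Lebewesen",
    "Hund": "Tier",
    "Katze": "Tier",
    "Baum": "Pflanze",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while HYPERNYM[node] is not None:
        node = HYPERNYM[node]
        path.append(node)
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1 (an assumption)."""
    return len(path_to_root(node))


def least_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):  # walks upward, so the first hit is the deepest
        if node in ancestors_a:
            return node
    raise ValueError("nodes do not share a root")


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = least_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def leacock_chodorow(a, b):
    """Leacock-Chodorow similarity: -log(path_length / (2 * max_depth)),
    with the path length counted in nodes."""
    lcs = least_common_subsumer(a, b)
    path_length = depth(a) + depth(b) - 2 * depth(lcs) + 1
    max_depth = max(depth(n) for n in HYPERNYM)
    return -math.log(path_length / (2 * max_depth))


if __name__ == "__main__":
    for pair in [("Hund", "Katze"), ("Hund", "Baum")]:
        print(pair, wu_palmer(*pair), leacock_chodorow(*pair))
```

In this toy tree both measures rate "Hund" and "Katze" as more closely related than "Hund" and "Baum", because the first pair shares a deeper common ancestor and is connected by a shorter path.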

GermaNet is a lexical-semantic net that relates German nouns, verbs, and adjectives semantically by grouping lexical units that express the same concept into synsets and by defining semantic relations between these synsets. GermaNet has much in common with the English WordNet (http://wordnet.princeton.edu) and can be viewed as an on-line thesaurus or a light-weight ontology.
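
To make the synset idea concrete, here is a minimal Python sketch of how lexical units might be grouped into synsets that are linked by semantic relations. The class, the field names, the IDs, and the example entries are invented for illustration and do not reflect GermaNet's actual data model or file format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Synset:
    """A set of lexical units that express the same concept (hypothetical model)."""
    synset_id: int
    word_class: str                      # "noun", "verb", or "adjective"
    lex_units: List[str]                 # orthographic forms of the lexical units
    # relation name -> IDs of related synsets, e.g. {"hypernym": [3]}
    relations: Dict[str, List[int]] = field(default_factory=dict)


# Invented example entries; the ambiguous word "Bank" appears in two synsets.
SYNSETS = {
    1: Synset(1, "noun", ["Bank", "Geldinstitut"], {"hypernym": [3]}),
    2: Synset(2, "noun", ["Bank", "Sitzbank"], {"hypernym": [4]}),
    3: Synset(3, "noun", ["Institution"]),
    4: Synset(4, "noun", ["Möbel"]),
}


def synsets_for(word, synsets=SYNSETS):
    """List all synsets (that is, all senses) containing the given word."""
    return [s for s in synsets.values() if word in s.lex_units]


def hypernyms_of(synset, synsets=SYNSETS):
    """Follow the 'hypernym' relation one step up."""
    return [synsets[i] for i in synset.relations.get("hypernym", [])]


if __name__ == "__main__":
    for sense in synsets_for("Bank"):
        parents = [h.lex_units for h in hypernyms_of(sense)]
        print(sense.synset_id, sense.lex_units, "->", parents)
```

Looking up a word returns one synset per sense, and following a relation from a synset yields other synsets rather than individual words, which is the graph structure the relatedness measures operate on.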

GermaNet has been developed and maintained within various projects by the research group for General and Computational Linguistics (Director: Prof. Dr. Erhard Hinrichs) at the University of Tübingen since 1997.

For more information about GermaNet, please consult the project website: http://www.sfs.uni-tuebingen.de/GermaNet/
