[Corpora-List] Summary of responses: German lemma list

Niels Ott niels at drni.de
Sat Mar 10 17:59:00 CET 2007


Dear all,

over a week ago I asked for a German lemma list and received a number of
replies. Of all the suggestions made, extracting a lemma list from the
ispell word list won the race, because it was the easiest thing to do in
the limited time we had.
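
A minimal sketch of what such an extraction could look like, assuming a
hunspell/myspell-style .dic file: an entry count on the first line, then
one entry per line in the form word/FLAGS, optionally followed by
tab-separated fields. The function name and the sample flags are
hypothetical, and the base entries of a spell-checker dictionary are
stems rather than guaranteed lemmata, so the result is only a rough
approximation of a lemma list.

```python
def extract_base_words(dic_lines):
    """Return the sorted, de-duplicated base entries of .dic-style lines."""
    words = set()
    for i, raw in enumerate(dic_lines):
        line = raw.strip()
        # Skip blank lines and the entry-count header on the first line.
        if not line or (i == 0 and line.isdigit()):
            continue
        # Keep only the part before any tab-separated extra fields,
        # then drop the affix flags that follow the first slash.
        entry = line.split("\t", 1)[0]
        word = entry.split("/", 1)[0].strip()
        if word:
            words.add(word)
    return sorted(words)


# Illustrative sample; the flags after the slash are made up.
sample = ["4", "Haus/EPS", "Katze/N", "gehen/IX", "sitzen"]
lemmas = extract_base_words(sample)
print(lemmas)
```

With a real dictionary, the lines would be read from the .dic file using
the encoding declared in the accompanying .aff file.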

Let me briefly summarize the suggestions I received both on the list and
in private (in no particular order):

Annette Klosa offered a contract for academic use of the word list from
the elexiko project, which is based on frequency data from the German IDS
corpora. http://www.elexiko.de/

Lars Aronson was the one who suggested using German spell checker
dictionaries, namely those of ispell/aspell/myspell/hunspell.*

René Witte suggested having a look at the Durm Lemmatizer, which
apparently comes with a lexicon.*
http://www.ipd.uni-karlsruhe.de/~durm/tm/lemma/

Yannick Versley suggested using the lexicon of the CDG parser.*
http://nats-www.informatik.uni-hamburg.de/view/CDG/DownloadPage

Peter Adolphs suggested having a look at Morphy by Wolfgang Lezius,
which can export the lexical data it uses. http://www.wolfganglezius.de/

[*]: These are (part of) open source projects.

Thank you very much for your assistance!

Regards,

Niels Ott


Niels Ott wrote:

> Dear all,
>
> about a month ago there was a little discussion going on here about
> English lemma lists.
>
> We are looking for a lemma list for German. There is no special
> requirement other than that it contain lemmata, e.g.
>
> Haus
> Katze
> gehen
> sitzen
>
> Furthermore, it would be nice if the list came with POS tags, but
> that's not a strict requirement.
>
> It would be ideal if this list were free in the sense of free
> speech/open source, or if its use were restricted only to
> non-commercial applications. (This is for a student project at
> university.)
>
> Thank you very much in advance for your assistance.
>
> Regards,
>
> Niels Ott

--
Niels Ott - Computational Linguist (B.A.) - http://www.drni.de/niels/
Tangente: Veralgter Wasservogel