[Corpora-List] R or Python: German lemmatizer / tokenizer / PoS-Tagger?

Amir Zeldes Amir.Zeldes at georgetown.edu
Fri Mar 25 15:23:21 CET 2016

Hi Hanjo,

There's a Python wrapper for TreeTagger, which comes with freely available models for German tagging and lemmatization:
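As a rough sketch of what working with the results can look like: TreeTagger prints one token per line as three tab-separated fields (word form, PoS tag, lemma), so whatever wrapper is used, the output can be turned into records with a few lines of Python. The helper name `parse_treetagger_lines` and the sample lines below are made up for illustration; the sample is hand-written in the STTS style and is not real tagger output.

```python
from typing import Iterable, List, NamedTuple


class Token(NamedTuple):
    word: str
    pos: str
    lemma: str


def parse_treetagger_lines(lines: Iterable[str]) -> List[Token]:
    """Turn TreeTagger-style 'word<TAB>pos<TAB>lemma' lines into Token records.

    Hypothetical helper, not part of TreeTagger or of any wrapper.
    Lines that do not have exactly three fields (blank lines, SGML tags)
    are skipped.
    """
    tokens = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue
        tokens.append(Token(*parts))
    return tokens


# Hand-written sample in the three-column format (illustrative only).
sample = [
    "Die\tART\tdie",
    "Kinder\tNN\tKind",
    "spielten\tVVFIN\tspielen",
    "<s>",
    ".\t$.\t.",
]

tokens = parse_treetagger_lines(sample)
lemmas = [t.lemma for t in tokens]
nouns = [t.word for t in tokens if t.pos.startswith("N")]
```

With records like these, filtering by tag or collecting lemmata for a frequency list is a one-liner, as in the last two lines.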


Amir
------------
Dr. Amir Zeldes
Asst. Prof. of Computational Linguistics
Department of Linguistics
Georgetown University
1437 37th St. NW
Washington, DC 20057


-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Amy Isard
Sent: Wednesday, March 23, 2016 11:15
To: corpora at uib.no
Subject: Re: [Corpora-List] R or Python: German lemmatizer / tokenizer / PoS-Tagger?


I haven't used the German module of CLiPS Pattern (http://www.clips.ua.ac.be/pages/pattern-de), but I have used the English one, and it is easy to understand and well documented. It's written in Python.


On 23/03/16 14:50, Hanjo Hamann wrote:
> Dear all,
> a colleague of mine routinely uses R packages to annotate English
> texts, but has struggled to find any package for texts in German
> language. Do any of you use tried-and-proven packages for either R or
> Python that provide various annotation features (lemmata, tokens, PoS)
> for German (web forum chat) texts?
> Best regards,
> Hanjo

