[Corpora-List] A Lemmatizer is Required

Sabine Bartsch bartsch at linglit.tu-darmstadt.de
Fri Feb 29 14:45:43 CET 2008


Hi there,

I would suggest you have a look at Helmut Schmid's TreeTagger at the IMS, University of Stuttgart:

http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/

It is a POS tagger and lemmatizer that works for several languages, depending on the parameter files selected. Performance is good, and it runs under Linux, Solaris, Mac OS X and Windows.
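
To get from the tagger output to a plain list of base forms, something like the following might do. It is only a rough sketch, assuming the tagger was run with the options that print the token and the lemma (-token -lemma), so that each output line holds three tab-separated columns (token, POS tag, lemma) and words missing from the lexicon get the lemma <unknown>. The function name and the sample lines are made up for illustration; the sample is hand-written, not real tagger output.

```python
from typing import Iterable, List

# Placeholder assumed to mark words whose lemma the tagger could not find.
UNKNOWN = "<unknown>"


def lemmas_from_tagger_output(lines: Iterable[str]) -> List[str]:
    """Collect one lemma per token from token<TAB>tag<TAB>lemma lines.

    Hypothetical helper: lines that do not have exactly three
    tab-separated fields (blank lines, markup passed through) are skipped.
    """
    lemmas = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue
        token, _tag, lemma = parts
        # Fall back to the surface form when no lemma is known.
        lemmas.append(token if lemma == UNKNOWN else lemma)
    return lemmas


if __name__ == "__main__":
    # Hand-written sample in the assumed three-column format.
    sample = (
        "The\tDT\tthe\n"
        "corpora\tNNS\tcorpus\n"
        "were\tVBD\tbe\n"
        "Kubuntu\tNP\t<unknown>\n"
    )
    print(lemmas_from_tagger_output(sample.splitlines()))
```

For a corpus of a couple of million words, the same function can be fed an open file object instead of a list, so the output is read line by line rather than loaded at once.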

Best of luck

Sabine

True Friend wrote:
> Hi
> I want an open source (or at least free) lemmatizer which can lemmatize
> a corpus of 2.1 million English words into their base forms etc.
> Linux-only software is no problem, as I have Kubuntu along with Windows XP.
> Regards
>
> --
> محمد شاکر عزیز

--
Dr. Sabine Bartsch
Technische Universität Darmstadt
Institut für Sprach- und Literaturwissenschaft - Englische Linguistik
Hochschulstr. 1
64289 Darmstadt
Fon: +49-6151-16 4570
Fax: +49-6151-16 3694
http://www.linglit.tu-darmstadt.de/bartsch


