[Corpora-List] python's NLTK vs R's TM

Steven Bird sb at csse.unimelb.edu.au
Tue Aug 21 12:22:36 CEST 2012

On 21 August 2012 19:52, Christian Pietsch <chr.pietsch at googlemail.com> wrote:
> As you know, a Python 3 version of NLTK has not been released yet.
> Early last year, the NLTK team said they were waiting for NumPy to be
> ported to Python 3. Then somebody told them that had already happened.
> I do not know what their current excuse is ;-)

We're mostly there, as you can see, thanks to the voluntary efforts of Mikhail Korobov and others. There are some sticky issues with supporting Python 2 and 3 from the same codebase, with character encodings and Unicode, and with our test framework. We're slowly resolving these, but we depend on volunteers.
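To give a flavour of the single-codebase and Unicode issues, here is a minimal sketch of the kind of compatibility shim such a port tends to need. It is an illustration only, not NLTK's actual code: the names PY3, text_type, binary_type and to_text are hypothetical, and UTF-8 as the default encoding is an assumption.

```python
import sys

# True when running under Python 3 or later.
PY3 = sys.version_info[0] >= 3

# Python 2 calls its text type `unicode` and its byte type `str`;
# Python 3 calls them `str` and `bytes`. Alias them once, up front.
if PY3:
    text_type = str
    binary_type = bytes
else:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)
    binary_type = str


def to_text(value, encoding="utf-8"):
    """Return `value` as a text string, decoding byte strings if needed.

    Hypothetical helper; the UTF-8 default is an assumption.
    """
    if isinstance(value, binary_type):
        return value.decode(encoding)
    if isinstance(value, text_type):
        return value
    raise TypeError("expected bytes or text, got %s" % type(value).__name__)
```

Decoding once at the input boundary, as to_text does, lets the rest of the code work with text only, which is the usual way to keep the same source running on both versions.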

Now that NLTK has moved to GitHub, it is easier than ever for people to contribute new functionality and bugfixes. NLTK is downloaded nearly 300 times a day, so any contributions that people care to make are likely to have a significant impact.

Here are the places to start:

https://github.com/nltk/nltk (the home of NLTK development)
https://groups.google.com/forum/?fromgroups#!forum/nltk-dev (developers)
https://groups.google.com/forum/?fromgroups#!forum/nltk-users (end users)

-Steven Bird
