On 11/12/2010 07:14 PM, Yannick Versley wrote:
> Most of my own software for CPU- or memory-intensive computation
> uses bits of Cython (i.e. it would be awesome if PyPy could talk to
> cpdef functions in Cython modules and automagically optimize away the
> boxing/unboxing at the PyPy/Cython boundary),
I guess this is getting off-topic for the list, but of course the hope is that with PyPy you don't actually need Cython often because it's fast enough :-). Not that we are quite there yet...
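For anyone on the list who hasn't used Cython: a rough sketch of the kind of cpdef function being discussed (made up for illustration, not from Yannick's code). `cpdef` generates both a fast C-level entry point and a Python-callable wrapper; the boxing/unboxing happens only in the wrapper, which is exactly the boundary a PyPy/Cython bridge would want to optimize away:

```cython
# Hypothetical cpdef function: callable from other Cython code at C speed,
# and from plain Python through an auto-generated boxing wrapper.
cpdef long dot(long[:] xs, long[:] ys):
    cdef long total = 0
    cdef Py_ssize_t i
    for i in range(xs.shape[0]):
        total += xs[i] * ys[i]   # C-level arithmetic, no PyObject boxing
    return total                 # boxed into a Python int only at the boundary
```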
> but here are two
> examples of code that will probably fit the bill, in that they read
> in data and use more memory the more data you feed them:
> * The DECCA toolkit looks at sequences of POS tags and words
Yes, this looks very good. Will look into it. Any quick pointers to a corpus I could use?
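In case it helps other readers, here's a toy sketch of the variation-n-gram idea behind DECCA, reduced to the unigram case (the mini-corpus, words, and tags below are all made up; real runs use a large annotated treebank, which is also why memory grows with input size):

```python
from collections import defaultdict

# Hypothetical mini-corpus of (word, POS-tag) pairs.
corpus = [
    [("the", "DT"), ("plant", "NN"), ("grows", "VBZ")],
    [("workers", "NNS"), ("plant", "VBP"), ("trees", "NNS")],
]

def variation_unigrams(sents):
    """Find words annotated with more than one POS tag (candidate
    annotation inconsistencies, in DECCA's terminology)."""
    tags_by_word = defaultdict(set)
    for sent in sents:
        for word, tag in sent:
            tags_by_word[word].add(tag)
    return {w: tags for w, tags in tags_by_word.items() if len(tags) > 1}

print(variation_unigrams(corpus))
```

The interesting part for benchmarking is that the dictionaries scale with corpus size, so feeding it more data directly exercises memory behavior.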
> NLTK is probably a very good testbed for PyPy since
> * it comes with its own data, so there's no need to hunt for datasets
> or produce synthetic data
> * it's actually written with clarity in mind and probably contains less
> squeezing-the-last-drops-of-performance-out-of-CPython code than
> other projects.
Oh yes, thanks, I had already planned to look into NLTK more.