On 13 Aug 2012, at 15:35, Jeff Elmore wrote:
> I'm curious what folks are using these days for sentence segmenting for English.
> My application involves narrative and informational texts at a variety of reading levels and genres. Most text is hand-edited to eliminate non-prose content, but any system that could respond robustly to unedited text would be awesome, of course.
> Mostly we've been using hand-crafted tools written in Python. I have checked out what NLTK offers, but from what I've seen nothing in it is terribly accurate (it fails on obvious common cases such as some honorifics). We did develop a decision-tree-based model using Weka for Spanish text. I'd be happy to do this again for English but wanted to see if there's something good already out there.
> Thanks in advance!
-- Florian Leitner, PhD <fleitner.cnio at gmail.com>
Structural Biology and BioComputing Programme
Spanish National Cancer Research Centre (CNIO)
Address: C/ Melchor Fernandez Almagro 3; E-28029 Madrid
Phone: +34 91 732 8000
Fax: +34 91 224 6980
Internet: http://www.cnio.es