On 13/08/12 14:35, Jeff Elmore wrote:
> I'm curious what folks are using these days for sentence segmenting for
> English.
>
> My application involves narrative and informational texts at a variety
> of reading levels and genres. Most text is hand-edited to eliminate
> non-prose content but any system that could respond robustly to unedited
> text would be awesome, of course.
>
> Mostly we've been using hand-crafted tools written in Python. I have
> checked out what NLTK offers but from what I've seen there's not
> anything terribly accurate in it (fails on obvious common cases like
> some honorifics). We did develop a decision-tree-based model using Weka
> for Spanish text. I'd be happy to do this again for English but wanted
> to see if there's something good already out there.
>
> Thanks in advance!
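
For reference, the honorific failure mentioned above is the kind of case a small abbreviation list handles. What follows is only a rough sketch using the Python standard library, not NLTK's method or anyone's production tool. The names ABBREVIATIONS, BOUNDARY and split_sentences are made up for illustration, and the abbreviation list is deliberately tiny.

```python
import re

# Hypothetical, deliberately short abbreviation list (an assumption for this
# sketch); a usable list would be much longer and domain-specific.
ABBREVIATIONS = {"mr", "mrs", "ms", "dr", "prof", "sr", "jr", "st"}

# Candidate boundary: terminal punctuation, optional closing quotes/brackets,
# whitespace, then something that looks like the start of a sentence.
BOUNDARY = re.compile(r"""([.!?]+["')\]]*)\s+(?=["'(\[]?[A-Z0-9])""")


def split_sentences(text):
    """Split text into sentences, skipping periods after known abbreviations."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        # Last word before the punctuation, with leading quotes/brackets removed.
        words = text[start:match.start(1)].split()
        last = words[-1].lower().strip("\"'([") if words else ""
        if match.group(1).startswith(".") and last in ABBREVIATIONS:
            continue  # e.g. "Dr." or "Mrs.": not a sentence boundary
        sentences.append(text[start:match.end(1)].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


print(split_sentences("Dr. Smith met Mrs. Jones. They talked."))
# ['Dr. Smith met Mrs. Jones.', 'They talked.']
```

The obvious weakness is that an abbreviation that really does end a sentence ("He lives on Main St. She doesn't.") is never split, and lower-case sentence starts are missed entirely, so this is a baseline to compare against rather than a solution for unedited text.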