[Corpora-List] Cyrillic tokenizer and sentence splitter

George Mitrevski mitrege at auburn.edu
Thu May 12 23:52:01 CEST 2005


Hi folks.

Can anyone recommend a good Perl sentence splitter and tokenizer that
works well with Cyrillic characters/texts (Russian, Bulgarian, etc.)?
I've tried some for English, German and other languages, but they don't
do well with Cyrillic.

Thanks,

George.

Foreign Languages tel. 334-844-6376
6030 Haley Center fax. 334-844-6378
Auburn University
Auburn, AL 36849
home: www.auburn.edu/~mitrege