[Corpora-List] Rule-based toolkit RDRPOSTagger for POS and morphological tagging

Dai Quoc Nguyen nquocdai at gmail.com
Wed Apr 6 15:56:35 CEST 2016


(Apologies for cross-posting)

We are pleased to announce the release of RDRPOSTagger (version 1.2.1).

RDRPOSTagger is a robust, easy-to-use and language-independent toolkit for POS and morphological tagging. It employs an error-driven approach to automatically construct tagging rules in the form of a binary tree. The main properties of RDRPOSTagger are as follows:

- RDRPOSTagger is fast in both learning and tagging. For example, it achieved tagging speeds of 5K and 90K English word tokens/second for single-threaded implementations in Python and Java respectively, on a computer with a 2.4GHz Core2Duo processor.

- RDRPOSTagger achieves very competitive accuracy in comparison to state-of-the-art results. Experimental results for 13 languages, including training time, tagging speed and tagging accuracy, are reported in the following paper:

A Robust Transformation-Based Learning Approach Using Ripple Down Rules for Part-Of-Speech Tagging <http://content.iospress.com/articles/ai-communications/aic698>. *AI Communications*, to appear. [CameraReadyVersion.pdf] <http://arxiv.org/abs/1412.4021>

- RDRPOSTagger supports pre-trained POS and morphological tagging models for 13 languages: Bulgarian, Czech, Dutch, English, French, German, Hindi, Italian, Portuguese, Spanish, Swedish, Thai and Vietnamese.
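For readers unfamiliar with rules stored as a binary tree, the following is a minimal Python sketch of how such a tree might be evaluated. It follows the general Ripple Down Rules scheme (each node has an "except" branch, followed when its condition holds, and an "if-not" branch, followed when it does not), and the conclusion of the last node that fired wins. This is an illustration under those assumptions, not the toolkit's actual code or API: the names Node, classify and retag, the toy rules and the initial tags are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """One rule in a hypothetical rule tree (not RDRPOSTagger's own class)."""
    condition: Callable[[List[str], List[str], int], bool]
    conclusion: Optional[str]               # None means "keep the initial tag"
    except_child: Optional["Node"] = None   # followed when the condition holds
    if_not_child: Optional["Node"] = None   # followed when the condition fails


def classify(root: Node, words: List[str], tags: List[str], i: int) -> Optional[str]:
    """Walk the tree for token i; the last rule that fired decides the tag."""
    node, result = root, None
    while node is not None:
        if node.condition(words, tags, i):
            result = node.conclusion
            node = node.except_child
        else:
            node = node.if_not_child
    return result


def retag(words: List[str], initial_tags: List[str], root: Node) -> List[str]:
    """Apply the tree to every token, falling back to the initial tag."""
    out = []
    for i in range(len(words)):
        new_tag = classify(root, words, initial_tags, i)
        out.append(new_tag if new_tag is not None else initial_tags[i])
    return out


# Toy tree: the default rule always fires and keeps the initial tag.
# Exception: a token initially tagged MD stays MD ...
# ... except when the previous token is a determiner, in which case it is NN.
ROOT = Node(
    condition=lambda w, t, i: True,
    conclusion=None,
    except_child=Node(
        condition=lambda w, t, i: t[i] == "MD",
        conclusion="MD",
        except_child=Node(
            condition=lambda w, t, i: i > 0 and t[i - 1] == "DT",
            conclusion="NN",
        ),
    ),
)

if __name__ == "__main__":
    # Initial tags would normally come from a lexicon; here they are made up.
    print(retag(["the", "can", "rusts"], ["DT", "MD", "VBZ"], ROOT))
```

In this sketch, "can" after "the" is retagged from MD to NN, while "can" after a pronoun keeps MD. In an error-driven setting, new exception nodes would be added where the current tree makes mistakes on training data; that learning step is not shown here.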

Please find more information about RDRPOSTagger at its website: http://rdrpostagger.sourceforge.net

Best regards,
RDRPOSTagger development team