[Corpora-List] Chinese Tokenization

Xu Jiajin ustcxujj at gmail.com
Tue Aug 14 14:37:46 CEST 2012


Hi Ajay,

You can get a copy of the ICTCLAS tokeniser, developed by Dr. Kevin Zhang, at http://www.ictclas.org/ictclas_download.aspx.

ICTCLAS is one of the best Chinese tokenisers.

Jiajin XU, Ph.D., Associate Professor
National Research Centre for Foreign Language Education
Beijing Foreign Studies University

On Tue, Aug 14, 2012 at 8:17 PM, Ajay <ajay0221 at gmail.com> wrote:


> Dear Corpora list members,
>
> I am looking for a Chinese tokenization and lemmatization tool to
> process Chinese Wikipedia text.
> Please suggest an open-source, freely available tool.
>
> Regards,
> Ajay Dubey
> M.S. by Research
> SIEL, IIIT, Hyderabad
>


