[Corpora-List] Chinese Tokenization

Craig Pfeifer craig.pfeifer at gmail.com
Wed Aug 15 16:51:43 CEST 2012


At the risk of asking a question I already know the answer to:

Is there a place where we could maintain a list of NLP resources (applications, components, code snippets, corpora) so that we can accumulate all the wonderful links that pass fleetingly through our inboxes?

Perhaps start a Wikipedia page? Perhaps an offshoot of: http://en.wikipedia.org/wiki/Natural_language_processing

or perhaps on the ACL wiki? http://aclweb.org/aclwiki/index.php?title=Main_Page

______________
craig.pfeifer at gmail.com

On Tue, Aug 14, 2012 at 8:37 AM, Xu Jiajin <ustcxujj at gmail.com> wrote:
> Hi Ajay,
>
> You can get a copy of the ICTCLAS tokeniser, developed by Dr. Kevin Zhang, at
> http://www.ictclas.org/ictclas_download.aspx.
>
> ICTCLAS is one of the best Chinese tokenisers.
>
> Jiajin XU
> Ph.D., associate professor
> National Research Centre for Foreign Language Education
> Beijing Foreign Studies University
>
> On Tue, Aug 14, 2012 at 8:17 PM, Ajay <ajay0221 at gmail.com> wrote:
>>
>> Dear Corpora list members,
>>
>> I am looking for a Chinese tokenization and lemmatization tool to
>> tokenize Chinese Wikipedia text.
>> Please suggest an open-source, freely available tool.
>>
>> Regards,
>> Ajay Dubey
>> M.S. by Research
>> SIEL, IIIT, Hyderabad
>>
>> _______________________________________________
>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>>


