[Corpora-List] BCCWJ corpus

Darren Cook darren at dcook.org
Fri Jul 7 23:45:50 CEST 2017


Can someone tell me how to download the BCCWJ corpus? There is a "shonagon" and a "chunagon" link (*), but the chunagon page describes itself as a web application. So I guessed that shonagon was the download, but it seems to be just an online search engine for the corpus. Is there no free download, and is it only available on the DVDs?

(I just wanted to reproduce the output of an open-source tokenizer that used BCCWJ as its training data, as a baseline for any improvements or bug fixes I might make.)

Thanks,

Darren

*: http://pj.ninjal.ac.jp/corpus_center/bccwj/en/


