[Corpora-List] BCCWJ corpus
Darren Cook
darren at dcook.org
Fri Jul 7 23:45:50 CEST 2017
Can someone tell me how to download the BCCWJ corpus? There is a
"shonagon" and a "chunagon" link (*), but the chunagon page describes
itself as a web application. So I guessed that shonagon was the download,
but it seems to be just an online search engine for the corpus. Is there
no free download, and is the corpus only available on the DVDs?
(I just want to reproduce the output of an open-source tokenizer that
used BCCWJ for its training data, as a baseline for any improvements or
bug fixes I might make.)
Thanks,
Darren
*: http://pj.ninjal.ac.jp/corpus_center/bccwj/en/