Both Shonagon and Chunagon are web interfaces for searching BCCWJ.
You can use Shonagon without registration, but it offers only a simple search function (no POS search). Chunagon requires registration and lets you run more advanced searches using POS tags.
If you would like to access the text data directly, you need to buy the DVD edition.
Basic info about the DVD edition of BCCWJ (in English): http://pj.ninjal.ac.jp/corpus_center/bccwj/en/dvd-index.html
Info about obtaining the DVD edition (in Japanese): http://pj.ninjal.ac.jp/corpus_center/bccwj/assets_c/2015/09/bccwj-chart-3818.html
Related documents are available from the link below: http://pj.ninjal.ac.jp/corpus_center/bccwj/subscription.html
There is no download version.
Shin
Dr. Shin Ishikawa Kobe University, Japan iskwshin at gmail.com
2017-07-08 6:45 GMT+09:00 Darren Cook <darren at dcook.org>:
> Can someone tell me how to download the BCCWJ corpus? There is a
> "shonagon" and a "chunagon" link (*), but the chunagon page describes
> itself as a web application. So I guessed the shonagon was the download;
> but it seems to just be an online search engine for the corpus. Is there
> no free download, and it is only available on the DVDs?
>
> (I just wanted to reproduce the output of an open source tokenizer that
> used BCCWJ for its training data, as a baseline for any improvements or
> bug fixes I might make.)
>
> Thanks,
>
> Darren
>
> *: http://pj.ninjal.ac.jp/corpus_center/bccwj/en/