[Corpora-List] Search returns no results (Chinese txt files)

bbs lists bbs.lists at gmail.com
Sat Apr 15 06:08:17 CEST 2017


You have to make sure that the Character Encoding setting under Global Settings matches the actual character encoding of the text. For example, if your text is in simplified Chinese and is not in Unicode, try setting the encoding to Chinese (euc-cn), one of the encodings used for simplified Chinese. If it is in Unicode (e.g. UTF-8), set it to UTF-8, and so on. If you don't know how to find out the encoding of the text, just try the different Chinese encodings and see which one works.
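
If you would rather check the encoding outside the concordancer, the trial-and-error approach can be sketched in a few lines of Python. This is only a rough illustration: the function name guess_encoding and the candidate list are my own choices, not anything AntConc provides. A clean decode does not prove the encoding is right (Big5 text, for instance, may decode under a GB codec as garbage), so still look at the output.

```python
# Candidate codecs, strictest first. "gb2312" is Python's name for euc-cn.
# This list is an assumption; reorder or extend it for your own files.
CANDIDATES = ("utf-8", "gb2312", "gbk", "gb18030", "big5")


def guess_encoding(data: bytes, candidates=CANDIDATES):
    """Return the first candidate codec that decodes `data` without error.

    Returns None if every candidate fails.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


if __name__ == "__main__":
    import sys

    # Usage: python guess_encoding.py myfile.txt
    with open(sys.argv[1], "rb") as handle:
        raw = handle.read()
    encoding = guess_encoding(raw)
    print("First encoding that decodes cleanly:", encoding)
    if encoding is not None:
        # Print a short sample so you can check by eye that it is readable.
        print(raw.decode(encoding)[:200])
```

Whatever encoding gives you readable Chinese in the sample is the one to select in Global Settings.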

Also, it helps if your Chinese text has already been segmented into words, with the word boundaries marked (e.g. by spaces).
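
To show what I mean by separated word boundaries, here is a toy sketch of forward maximum matching in Python. The three-word lexicon and the function name segment are made up for illustration; for real work you would use a proper segmenter with a full dictionary, and this is not what any particular tool does internally.

```python
def segment(text, lexicon, max_len=4):
    """Insert spaces between words using forward maximum matching.

    At each position, take the longest string (up to max_len characters)
    found in `lexicon`; fall back to a single character if nothing matches.
    """
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return " ".join(words)


# Hypothetical mini-lexicon, for demonstration only.
toy_lexicon = {"语料库", "语言学", "研究"}
print(segment("语料库语言学研究", toy_lexicon))
```

With the toy lexicon above, the unsegmented string comes out as three space-separated words, which is the form a concordancer can treat as separate tokens.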

For more Chinese-related questions, you might want to try www.corpus4u.org, which is a major forum for Chinese corpus linguistics enthusiasts.

Hope this helps.

Hongyin Tao UCLA

On Thu, Apr 13, 2017 at 11:42 AM, Nazarena Fazzari <nazarena.fazzari at gmail.com> wrote:


> Dear all,
> I am struggling with txt files in Chinese. I load the files into AntConc, and
> then I would like to check the concordance of a few terms, but the search
> returns no results or very few results. Then I check the files (File View)
> and I find a lot of occurrences.
> Any suggestions?
>
> Thanks!
>
> Nazarena