I have JusText (version 2.1 on some machines, 2.2 on others) installed on several Windows machines (Server 2012, Server 2012 R2, Windows 10), and it crashes on about 40% of all files because of encoding issues. The error message I get is:
  File "c:\python32\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position …..: character maps to <undefined>
Just one example of a page that is causing it to crash:
I've tried every possible combination of
as well as every possible encoding on the files, and it's still crashing on about 40% of all files.
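As far as I can tell, the same error can be reproduced without JusText at all, which suggests the files are being decoded as text with the Windows default codec (cp1252), where byte 0x81 is unassigned. The following is only a minimal sketch under that assumption: the sample bytes and the helper name `decode_lenient` are made up for illustration and are not part of JusText or taken from the failing pages.

```python
# Hypothetical sample bytes containing 0x81; not taken from the failing pages.
raw = b"caf\xc3\xa9 \x81 test"


def decode_lenient(data, encoding="utf-8"):
    """Decode bytes, substituting U+FFFD for undecodable bytes instead of raising."""
    return data.decode(encoding, errors="replace")


# Strict cp1252 decoding raises the same kind of error as in the traceback,
# because 0x81 has no character assigned in that code page.
try:
    raw.decode("cp1252")
    strict_failed = False
except UnicodeDecodeError as exc:
    strict_failed = True
    print("strict cp1252 decode failed:", exc)

# Lenient decoding never raises; the bad byte becomes U+FFFD.
text = decode_lenient(raw)
print(repr(text))
```

If this is indeed the cause, reading each file in binary mode (`open(path, "rb")`) and decoding it explicitly in this way before any further processing might avoid the crash, at the cost of a replacement character wherever a byte cannot be decoded.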
Again, sorry to post this to CORPORA, but hopefully someone might have some suggestions.
============================================
Mark Davies
Professor of Linguistics / Brigham Young University
http://davies-linguistics.byu.edu/

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================