[Corpora-List] AntConc 3.2.2 released for Windows and Mac OS X

Alexander Yeh asy at mitre.org
Thu Apr 14 03:46:27 CEST 2011


Laurence Anthony wrote:
> Hi Mike,
>
> On Thu, Apr 14, 2011 at 2:50 AM, maxwell<maxwell at umiacs.umd.edu> wrote:
>> Laurence Anthony<anthony0122 at gmail.com> wrote:
>>> Basically, all (pre Win 7?) Windows systems had their
>>> own legacy encodings, which varied from country to country.
>>> So, even if you have a file saved as UTF-8, the file *name*
>>> is saved in the legacy encoding.
>>
>> Are you sure? I thought NTFS filenames were Unicode:
>> http://en.wikipedia.org/wiki/Ntfs (see "Allowed characters in filenames")
>> http://msdn.microsoft.com/en-us/library/dd317748%28v=vs.85%29.aspx
>> --and NTFS superseded the older FAT filesystem as of Windows NT.

Even with Windows XP, I sometimes use disks and other media formatted as FAT, because both a Mac and Windows can read and write FAT. With NTFS, Macs used to be (and perhaps still are) read-only. So when I need a drive that both a Mac and Windows can read and write, I use FAT.

Thanks -Alex


>>
>> Mike Maxwell
>>
>
> It's a good question. I think the underlying OS stores everything as
> Unicode but then each system has a locale setting that's set to things
> like the legacy ShiftJIS here in Japan. It's also related to the
> Windows code page problem. See below:
> http://en.wikipedia.org/wiki/Windows_code_page.
>
> So, you never know what the encoding will be when you want to open
> files. If anybody has any advice on this, I would be very grateful!
> Laurence.
>
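
On the question of not knowing the encoding when opening a file, one common workaround is to read the raw bytes and try a short list of likely encodings in order. The following is only a minimal sketch under stated assumptions: the function names and the candidate list (UTF-8, Shift JIS, Windows-1252) are hypothetical choices for illustration, not anything AntConc is known to do. A clean decode does not prove the guess is right, so strict multi-byte encodings go first and permissive single-byte code pages go last.

```python
# Hypothetical helper names; the candidate list is an illustrative assumption.
CANDIDATES = ("utf-8", "shift_jis", "cp1252")


def decode_with_fallback(data, candidates=CANDIDATES):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for enc in candidates:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue  # this guess does not fit the bytes, try the next one
    # No candidate fit: decode lossily so the caller still gets some text.
    return data.decode("utf-8", errors="replace"), "utf-8 (lossy)"


def read_text(path, candidates=CANDIDATES):
    """Read a file as bytes, then guess its encoding from the candidates."""
    with open(path, "rb") as fh:
        return decode_with_fallback(fh.read(), candidates)
```

For example, `decode_with_fallback("日本語".encode("shift_jis"))` falls through UTF-8 and settles on Shift JIS. Ordering matters: if a single-byte code page came first, it would accept almost any input and hide the real encoding.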


