On Thu, Apr 14, 2011 at 2:50 AM, maxwell <maxwell at umiacs.umd.edu> wrote:
> Laurence Anthony <anthony0122 at gmail.com> wrote:
>> Basically, all (pre Win 7?) windows systems had their
>> own legacy encodings, which varied from country to country.
>> So, even if you have a file saved as UTF8, the file *name*
>> is saved in the legacy encoding.
> Are you sure? I thought NTFS filenames were Unicode:
> http://en.wikipedia.org/wiki/Ntfs (see "Allowed characters in filenames")
> --and NTFS superseded the older FAT filesystem as of Windows NT.
> Mike Maxwell
It's a good question. I think the underlying OS stores everything as Unicode, but each system also has a locale setting that is set to a legacy encoding, such as Shift JIS here in Japan. This is also related to the Windows code page problem; see http://en.wikipedia.org/wiki/Windows_code_page
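To illustrate the mismatch, here is a minimal Python 3 sketch. It assumes only two candidate encodings, UTF-8 and cp932 (Python's codec name for Microsoft's Shift JIS code page). The sample name and the helper decode_with_fallback are hypothetical, made up for this example. Trying encodings in order is a heuristic, not a reliable detector.

```python
import locale

# Hypothetical Japanese file name, used only for illustration.
name = "日本語.txt"

# The same text yields different bytes depending on the encoding.
utf8_bytes = name.encode("utf-8")
cp932_bytes = name.encode("cp932")

# The locale-dependent default that Python would use for open()
# when no encoding is given; it varies from system to system.
default_encoding = locale.getpreferredencoding(False)


def decode_with_fallback(raw, candidates=("utf-8", "cp932")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    This is a guess: some byte strings are valid in several encodings,
    so the first match is not guaranteed to be the intended one.
    """
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this never fails,
    # but the result may be unreadable.
    return raw.decode("latin-1"), "latin-1"


if __name__ == "__main__":
    print("locale default:", default_encoding)
    print(decode_with_fallback(utf8_bytes))
    print(decode_with_fallback(cp932_bytes))
```

UTF-8 is tried first because bytes in a legacy encoding rarely happen to be valid UTF-8, whereas the reverse mistake is more common. When the encoding is known, it is safer to pass it explicitly, for example open(path, encoding="cp932"), than to rely on the locale default.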
So, you never know what the encoding will be when you want to open a file. If anybody has any advice on this, I would be very grateful!

Laurence