[fpc-pascal] Console Encoding in Windows (Local VS. UTF8)

Noah Silva shiruba at galapagossoftware.com
Sun Jul 14 16:35:56 CEST 2013


Hi,

2013/7/9 Marco van de Voort <marcov at stack.nl>

> In our previous episode, Dennis Poon said:
> > Please state the windows version you are using. XP or Windows 7?
>
> In XP there were separate Far East versions of Windows.  In Vista+ this
> was wholly integrated and there is only one Vista base system with various
> language packs.
>
This is basically true.


> It might be that some of the Far East apis were deprecated later, but I
> would expect them to run if you configure the backwards compatibility
> options on the shortcut.
>
There aren't many "Far East APIs"; basically everything just used a local
variable-length (multi-byte) encoding that works similarly to UTF8 but only
supports local characters.  For the most part, none of the system APIs
cared what was in a string; they just plastered it somewhere.  Since 1 byte
<> 1 char, but normal code didn't know that, you were safe with things like
Concat, but couldn't safely do things like copy(string,3,4) and expect it
to work.  That's still a problem with FreePascal, but in general, things
like Copy() can work with all character sets not only because of Unicode.
Anything fancy that had to be done before was almost always at the
application layer.

Anyway, if Windows changed the encoding from "local" to UTF8 suddenly,
every single Japanese program that outputs SJIS to the command-line would
break.  In the GUI it is a little different because the system "knows"
whether a program is Unicode or non-Unicode, and there is a setting for
which encoding to use for non-Unicode apps.
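
As a rough illustration of why such a switch would break existing programs, the sketch below (Python, for illustration only) encodes the same text both ways and then reads the SJIS bytes as if they were UTF8. The helper `read_as` is a made-up name, and the sample string is arbitrary.

```python
message = "こんにちは"

sjis_bytes = message.encode("shift_jis")  # what an old SJIS program emits
utf8_bytes = message.encode("utf-8")      # what a UTF8 console would expect

# Same text, different byte sequences and even different lengths.
print(len(sjis_bytes), len(utf8_bytes))


def read_as(data: bytes, encoding: str) -> str:
    """Decode data, substituting U+FFFD for bytes that are not valid."""
    return data.decode(encoding, errors="replace")


# Read with the encoding the program actually used: the text survives.
print(read_as(sjis_bytes, "shift_jis") == message)

# Read the same bytes as UTF8: they are not valid UTF8, so the result
# contains replacement characters instead of the original text.
garbled = read_as(sjis_bytes, "utf-8")
print("\ufffd" in garbled)
```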


> > I deal with chinese in my programs so I know your problems.  The same
> > delphi 5 program works differently on XP and Windows 7.  Looks like
> > Windows 7 has removed support for non unicode (I am not sure whether the
> > Unicode it uses is UTF8, UTF16 or UTF32).
>

This is definitely not the case; plenty of non-Unicode programs work on
Windows 7.  Microsoft would face a huge revolt otherwise.

>
> UTF16 and a bit of UTF8. (e.g. notepad groks utf8 now).
>
Finally, now if only Excel could save UTF8 or handle UTF8 CSV files...


> > Seems that all filenames in XP are treated as unicode code.
>
> All NT derived systems are Unicode (UCS2, later UTF16) in the heart. The
> whole ansi support is a win32 compatibility layer.
>
Yes, which is amazing, since in many places there is no Unicode support in
user space.  You click on a CSV file and Excel needs it to be SJIS (or
whatever), even though I am sure modern versions of Excel are storing it as
UTF16 internally anyway.

Sadly it doesn't really matter how good NT is if everything uses the Win32
layer.  (I heard the NT kernel supported hard links and multiple mount
points like Unix too...)

At any rate, it would be nice if we could have the FreePascal libraries talk
directly to the Unicode interfaces (even the win32 ones), instead of the
local-encoding interfaces, wherever possible.  Besides being easier, it
would be more reliable.

Somehow I ended up spending 14 hours at work today (a Sunday!), so I
didn't have a chance to look at the hints given by others yet, but
tomorrow...  I may make something like a ConsoleWriteln(const s:UTF8String)
function that "just works" to the extent possible.
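
One possible shape for such a helper, sketched in Python only to show the idea: decode the UTF8 input, then write it in whatever encoding the output stream uses, substituting characters that encoding cannot represent. The name `console_writeln` and the replace-on-failure policy are assumptions for this example. It does not call any Windows API; on Windows the wide-character console call (WriteConsoleW) would avoid the code page altogether, which this sketch does not attempt.

```python
import io


def console_writeln(utf8_text: bytes, stream: io.TextIOBase) -> None:
    """Write UTF8 bytes to a text stream using the stream's own encoding."""
    text = utf8_text.decode("utf-8")
    encoding = getattr(stream, "encoding", None) or "utf-8"
    # Characters the target code page cannot represent become "?"
    # rather than raising an error halfway through the line.
    safe = text.encode(encoding, errors="replace").decode(encoding)
    stream.write(safe + "\n")


# Usage: a stand-in for a console whose code page is cp932 (Japanese).
buffer = io.BytesIO()
fake_console = io.TextIOWrapper(buffer, encoding="cp932", newline="\n")
console_writeln("日本語".encode("utf-8"), fake_console)
fake_console.flush()
print(buffer.getvalue() == "日本語\n".encode("cp932"))
```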

Thank you,
   Noah Silva

> _______________________________________________
> fpc-pascal maillist  -  fpc-pascal at lists.freepascal.org
> http://lists.freepascal.org/mailman/listinfo/fpc-pascal
>