[fpc-pascal] Console Encoding in Windows (Local VS. UTF8)
Michael Schnell
mschnell at lumino.de
Tue Jul 30 09:55:46 CEST 2013
On 07/30/2013 04:29 AM, Noah Silva wrote:
> No, UTF16 only needs more memory if most of the text is ASCII. It
> actually uses less than UTF8 in the average case for Japanese, for
> example.
Of course you are right here.
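To put rough numbers on this, here is a minimal sketch in Python (chosen only because it is quick to run, it has nothing to do with fpc itself). The sample strings and the helper name `encoded_sizes` are my own assumptions for illustration; the sizes are counted without a BOM.

```python
# Compare the encoded size of the same text in UTF-8 and UTF-16 (no BOM).
# The sample strings are arbitrary illustrations, not taken from any real data.
samples = {
    "ascii": "Hello, world",
    "japanese": "こんにちは世界",
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for the given text."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for name, text in samples.items():
    u8, u16 = encoded_sizes(text)
    print(f"{name}: UTF-8 = {u8} bytes, UTF-16 = {u16} bytes")
```

For the ASCII sample UTF-16 doubles the size (12 vs. 24 bytes), while for the Japanese sample UTF-8 needs three bytes per character and UTF-16 only two (21 vs. 14 bytes).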
>
> Linux OS API in most cases is 8 Bit,
>
>
> I assume by 8bit, you mean variable byte encoding like UTF8.
Yep.
>
> Conversions are very expensive.
>
>
> This is not as bad as some people make it out to be. You have to be
> converting a *lot* of data for it to be noticeable.
That is why I pointed out that the choice of encoding depends on how
many "calculations" are done on the strings.
But in fact I tend to agree, even though conversion cost was exactly the
argument given for why the Lazarus team, when converting to Unicode,
chose to do the LCL API in UTF-8 (while MSE chose UTF-16 for the same
purpose). I never felt comfortable with that, BTW.
>
> > I suppose this is bound to change once fpc has completed the move to
> "new Delphi Strings".
>
> I really don't think so, the reasons are even well detailed in the Wiki.
I was always told that Delphi compatibility is the primary driving force
for any modifications. This necessarily suggests this move (which is not
possible before fpc provides "new Delphi Strings"). But there might
be multiple opinions.
In fact my primary intentions with Lazarus / fpc are not to do my own
generic projects, but to help my colleagues to move their huge Delphi XE
program system to Linux. This in fact needs complete support for "new
Delphi Strings".
> From what I understand, the plan is for strings to store their
> codepage as an attribute internally along with their length, and since
> the compiler/runtime library will know their codepage, it can convert
> as necessary.
That is already ready to use in svn and is exactly the "new Delphi
Strings" I mentioned, and - when activated - completely compatible with
Delphi XE. It's rather nice and fast, but Delphi lacks a
_completely_dynamic_encoding_ type with auto-conversion only when
necessary. (IMHO rather easily doable by compiler magic, but "forgotten"
in Delphi XE)
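To make clear what I mean by a dynamic encoding type, here is a conceptual sketch in Python. It is not fpc or Delphi code and says nothing about how the compiler implements its strings; the class `LazyEncodedString` and its members are hypothetical names invented for this illustration. The idea is only that the string carries its encoding along and transcodes when, and only when, a different encoding is requested.

```python
import codecs


class LazyEncodedString:
    """Hypothetical string type: raw bytes tagged with their encoding.

    Transcoding happens only when a caller asks for a different encoding.
    """

    def __init__(self, data: bytes, encoding: str):
        self.data = data
        self.encoding = encoding
        self.conversions = 0  # counts real transcoding work, for illustration

    def as_encoding(self, target: str) -> bytes:
        # Normalize codec names so that e.g. "UTF8" and "utf-8" compare equal.
        if codecs.lookup(target).name == codecs.lookup(self.encoding).name:
            return self.data  # same encoding: no work needed
        self.conversions += 1
        return self.data.decode(self.encoding).encode(target)


s = LazyEncodedString("Grüße".encode("utf-8"), "utf-8")
same = s.as_encoding("UTF8")        # stored bytes returned as-is
wide = s.as_encoding("utf-16-le")   # transcoded on demand
print(len(same), len(wide), s.conversions)
```

A container built on such a type could store every string in whatever encoding it arrived in, so nothing is converted unless a consumer actually needs another encoding.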
> Either way, you can make your own StringList variants for each type
> easily enough.
Not without compiler support (if you want auto-conversion when necessary).
>
> In fact, I am fine with manual conversions, so long as 99% of
> everything "just works" with UTF8 and/or UTF16.
I'm not fine with TStringList and friends forcing any predefined
encoding. This in fact does work rather nicely without the application
programmer even noticing it. But IMHO a cross-platform system like fpc
can be expected to do better, doing away with Windows-ish remnants from
Delphi wherever possible.
-Michael