[fpc-pascal] Console Encoding in Windows (Local VS. UTF8)

Michael Schnell mschnell at lumino.de
Tue Jul 30 09:55:46 CEST 2013


On 07/30/2013 04:29 AM, Noah Silva wrote:

> No, UTF16 only needs more memory if most of the text is ASCII.  It 
> actually uses less than UTF8 in the average case for Japanese, for 
> example.
Of course you are right here.
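
To put rough numbers on the memory point, here is a minimal sketch (Python used only for illustration; both sample strings and the helper name `encoded_sizes` are made up for this example). It compares the encoded size of a short ASCII string with that of a short Japanese string, without a byte order mark:

```python
# Compare encoded sizes of ASCII and Japanese text.
# Both sample strings are assumptions chosen for this illustration.

def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

ascii_text = "Hello, world"        # 12 ASCII characters
japanese_text = "日本語のテキスト"    # 8 characters, all in the BMP

print(encoded_sizes(ascii_text))     # (12, 24)
print(encoded_sizes(japanese_text))  # (24, 16)
```

ASCII characters take one byte in UTF-8 and two in UTF-16, while these Japanese characters take three bytes in UTF-8 and two in UTF-16, so which encoding is smaller depends on the text.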
>
>     Linux OS API in most cases is 8 Bit,
>
>
> I assume by 8bit, you mean variable byte encoding like UTF8.
Yep.
>
>     Conversions are very expensive.
>
>
> This is not as bad as some people make it out to be.  You have to be 
> converting a *lot* of data for it to be noticeable.
That is why I pointed out that the choice of encoding depends on 
how many "calculations" are done on the strings.

But in fact I tend to agree, although exactly this argument was the 
reason why the Lazarus team, when converting to Unicode, chose to do the 
LCL API in UTF-8, while MSE chose UTF-16 for the same purpose. (I never 
felt comfortable with that, BTW.)

>
> > I suppose this is bound to change once fpc has completed the move to 
> > "new Delphi Strings".
>
> I really don't think so, the reasons are even well detailed in the Wiki.
I was always told that Delphi compatibility is the primary driving force 
for any modifications. This necessarily suggests this move (which is not 
possible before fpc provides "new Delphi Strings"). But there might 
be multiple opinions.

In fact my primary intentions with Lazarus / fpc are not to do my own 
generic projects, but to help my colleagues to move their huge Delphi XE 
program system to Linux. This in fact needs complete support for "new 
Delphi Strings".

> From what I understand, the plan is for strings to store their 
> codepage as an attribute internally along with their length, and since 
> the compiler/runtime library will know their codepage, it can convert 
> as necessary.
That is already ready to use in svn and is exactly the said "new 
Delphi Strings", and - when activated - completely compatible with 
Delphi XE. It's rather nice and fast, but Delphi lacks a 
_completely_dynamic_encoding_ type with auto-conversion only when 
necessary. (IMHO rather easily doable by compiler magic, but "forgotten" 
in Delphi XE)
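
To illustrate what I mean by a completely dynamic encoding type, here is a rough sketch of the idea only, not of any fpc or Delphi API. It is written in Python for brevity, and `DynString`, `as_encoding` and the `conversions` counter are hypothetical names invented for this example. Each value remembers the encoding it arrived in and is converted only when a different encoding is actually requested:

```python
# Hypothetical sketch: a string value that remembers its own encoding and
# is converted only when a different encoding is actually requested.
# DynString and its methods are invented names, not an fpc or Delphi API.

class DynString:
    def __init__(self, data: bytes, encoding: str):
        self.data = data          # raw bytes, kept as received
        self.encoding = encoding  # e.g. "utf-8", "utf-16-le", "cp1252"
        self.conversions = 0      # counts real conversions, for illustration

    def as_encoding(self, wanted: str) -> bytes:
        """Return the bytes in the wanted encoding, converting only if needed."""
        if wanted == self.encoding:
            return self.data      # no conversion: encodings already match
        self.conversions += 1
        return self.data.decode(self.encoding).encode(wanted)

    def __add__(self, other: "DynString") -> "DynString":
        # The left operand decides the result encoding; the right operand
        # is converted only when its encoding differs.
        return DynString(self.data + other.as_encoding(self.encoding),
                         self.encoding)


a = DynString("Grüße ".encode("utf-8"), "utf-8")
b = DynString("Welt".encode("cp1252"), "cp1252")
c = a + b
print(c.encoding, c.data.decode(c.encoding))
```

In this sketch, concatenating two values that share an encoding never converts anything; only the mixed case pays for a conversion. A compiler could presumably apply the same rule to assignments and parameter passing.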
>  Either way, you can make your own StringList variants for each type 
> easily enough.
Not without compiler support (if you want auto-conversion when necessary).
>
> In fact, I am fine with manual conversions, so long as 99% of 
> everything "just works" with UTF8 and/or UTF16.
I'm not fine with TStringList and friends forcing any predefined 
encoding. This in fact does work rather nicely without the application 
programmer even noticing it. But IMHO a cross platform system like fpc 
can be expected to do better, doing away with Windows-ish remnants from 
Delphi whenever possible.

-Michael