[fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

Ondrej Pokorny lazarus at kluug.net
Fri Dec 27 01:52:21 CET 2019


On 27.12.2019 0:19, Michael Van Canneyt wrote:
> On Thu, 26 Dec 2019, Ondrej Pokorny wrote:
>
>> On 26.12.2019 19:29, Michael Van Canneyt wrote:
>>> So no, I don't think these need to be changed/merged. What IMO can 
>>> be discussed is
>>> which of these 2 need to be used as the default codepage in other 
>>> code. It
>>> should then resolve the problems that appear, I think.
>>
>> That would be possible as well. But still please reconsider it:
>> One reason: just from the convention - the default codepage to use 
>> should be TEncoding.Default. That is intuitive.
>>
>> Second reason: Now we have TEncoding.ANSI = TEncoding.Default. 2 
>> equal properties. And another FPC-only property 
>> TEncoding.SystemEncoding. That means 3 properties for 2 values.
>
> As far as I know, TEncoding.ANSI = CP_ACP.

That is actually not correct. See 
https://wiki.freepascal.org/FPC_Unicode_support :
CP_ACP: this value represents the currently set "default system code 
page". See #Code page settings for more information.

The code for it is in sysos.inc:
function TranslatePlaceholderCP(cp: TSystemCodePage): TSystemCodePage; {$ifdef SYSTEMINLINE}inline;{$endif}
begin
   TranslatePlaceholderCP:=cp;
   case cp of
     CP_OEMCP:
       TranslatePlaceholderCP:=GetOEMCP;
     CP_ACP:
       TranslatePlaceholderCP:=DefaultSystemCodePage;
   end;
end;

Whereas TEncoding.ANSI is the OS ANSI encoding (the Windows ANSI code page on Windows):

class function TEncoding.GetANSI: TEncoding;
// ...
         FStandardEncodings[seAnsi] := TMBCSEncoding.Create(widestringmanager.GetStandardCodePageProc(scpAnsi))

and
   TStandardCodePageEnum = (
     scpAnsi,                 // system Ansi code page (GetACP on windows)

As you can see, the CP_ACP value does not necessarily correspond to the 
result of the GetACP WinAPI call. (But this is intended, as documented in 
https://wiki.freepascal.org/FPC_Unicode_support ).

> Why should this equal TEncoding.Default ? 

sysencoding.inc:

class function TEncoding.GetDefault: TEncoding;
begin
   Result := GetANSI;
end;

> I think  TEncoding.Default  = CP_UTF8 on linux ?

Yes, in FPC this is correct. TEncoding.ANSI = CP_UTF8 on Linux in FPC as well.


> The main problem I see is that there is the system (OS) encoding, and the
> encoding specified by DefaultSystemCodePage.
>
> These do not necessarily agree. So it makes sense to have 2 
> TEncodings: one
> for the system encoding, one for the DefaultSystemCodePage variable. They
> will not be equal.
>
> If they were, then the DefaultSystemCodePage variable makes no sense 
> whatever.

Yes, indeed. Therefore I suggested
* TEncoding.Default for the DefaultSystemCodePage variable
and
* TEncoding.ANSI for the system encoding.

Currently we have
* TEncoding.SystemEncoding for the DefaultSystemCodePage variable
and
* both TEncoding.ANSI and TEncoding.Default for the system encoding. 
(TEncoding.ANSI and TEncoding.Default are equal in FPC.)
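
To compare the values on a given system, a minimal sketch like the 
following could be used. It assumes an FPC version that already has the 
FPC-only TEncoding.SystemEncoding property; the program name is only 
illustrative, and the numbers printed depend on the platform and on any 
code page settings made by the program.

```pascal
program ShowEncodings;

{$mode objfpc}{$H+}

uses
  SysUtils;

begin
  // Code page that CP_ACP is translated to (see TranslatePlaceholderCP).
  WriteLn('DefaultSystemCodePage:    ', DefaultSystemCodePage);

  // Currently the same encoding as TEncoding.ANSI, since GetDefault calls GetANSI.
  WriteLn('TEncoding.Default:        ', TEncoding.Default.CodePage);

  // OS ANSI code page (GetACP on Windows).
  WriteLn('TEncoding.ANSI:           ', TEncoding.ANSI.CodePage);

  // FPC-only property that follows DefaultSystemCodePage.
  WriteLn('TEncoding.SystemEncoding: ', TEncoding.SystemEncoding.CodePage);
end.
```

If DefaultSystemCodePage is changed from its initial value, the first 
and last lines should be the ones that differ from the two in the middle.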

Ondrej


