[fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()
Ondrej Pokorny
lazarus at kluug.net
Fri Dec 27 01:52:21 CET 2019
On 27.12.2019 0:19, Michael Van Canneyt wrote:
> On Thu, 26 Dec 2019, Ondrej Pokorny wrote:
>
>> On 26.12.2019 19:29, Michael Van Canneyt wrote:
>>> So no, I don't think these need to be changed/merged. What IMO can
>>> be discussed is which of these 2 need to be used as the default
>>> codepage in other code. It should then resolve the problems that
>>> appear, I think.
>>
>> That would be possible as well. But still please reconsider it:
>> One reason: just from the convention - the default codepage to use
>> should be TEncoding.Default. That is intuitive.
>>
>> Second reason: Now we have TEncoding.ANSI = TEncoding.Default. 2
>> equal properties. And another FPC-only property
>> TEncoding.SystemEncoding. That means 3 properties for 2 values.
>
> As far as I know, TEncoding.ANSI = CP_ACP.
This is actually not correct. See
https://wiki.freepascal.org/FPC_Unicode_support :
CP_ACP: this value represents the currently set "default system code
page". See #Code page settings for more information.
The code for it is in sysos.inc:
function TranslatePlaceholderCP(cp: TSystemCodePage): TSystemCodePage;
{$ifdef SYSTEMINLINE}inline;{$endif}
begin
  TranslatePlaceholderCP:=cp;
  case cp of
    CP_OEMCP:
      TranslatePlaceholderCP:=GetOEMCP;
    CP_ACP:
      TranslatePlaceholderCP:=DefaultSystemCodePage;
  end;
end;
Whereas TEncoding.ANSI is the WIN-ANSI OS encoding:
class function TEncoding.GetANSI: TEncoding;
// ...
  FStandardEncodings[seAnsi] :=
    TMBCSEncoding.Create(widestringmanager.GetStandardCodePageProc(scpAnsi))
and
TStandardCodePageEnum = (
  scpAnsi, // system Ansi code page (GetACP on windows)
As you can see, the CP_ACP value does not necessarily correspond to the
result of the GetACP WinAPI call. (But this is intended, as documented in
https://wiki.freepascal.org/FPC_Unicode_support ).
> Why should this equal TEncoding.Default ?
sysencoding.inc:
class function TEncoding.GetDefault: TEncoding;
begin
  Result := GetANSI;
end;
> I think TEncoding.Default = CP_UTF8 on linux ?
Yes, in FPC this is correct. Also TEncoding.ANSI = CP_UTF8 on linux in FPC.
> The main problem I see is that there is the system (OS) encoding, and the
> encoding specified by DefaultSystemCodePage.
>
> These do not necessarily agree. So it makes sense to have 2
> TEncodings: one for the system encoding, one for the
> DefaultSystemCodePage variable. They will not be equal.
>
> If they were, then the DefaultSystemCodePage variable makes no sense
> whatever.
Yes, indeed. Therefore I suggested
* TEncoding.Default for the DefaultSystemCodePage variable
and
* TEncoding.ANSI for the system encoding.
Currently we have
* TEncoding.SystemEncoding for the DefaultSystemCodePage variable
and
* both TEncoding.ANSI and TEncoding.Default for the system encoding.
(TEncoding.ANSI and TEncoding.Default are equal in FPC.)
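The current state can be checked with a minimal sketch like the following. It assumes an FPC version that already provides TEncoding.SystemEncoding; the program name ShowEncodings is made up for illustration. It only prints the code page each property reports, so the output depends on the platform and on the current DefaultSystemCodePage setting:

```pascal
program ShowEncodings; // hypothetical name, for illustration only
{$mode objfpc}{$H+}
uses
  SysUtils;
begin
  // The RTL variable that the CP_ACP placeholder is translated to.
  WriteLn('DefaultSystemCodePage:    ', DefaultSystemCodePage);
  // FPC-only property, expected to follow DefaultSystemCodePage.
  WriteLn('TEncoding.SystemEncoding: ', TEncoding.SystemEncoding.CodePage);
  // OS ANSI encoding (the GetACP result on Windows).
  WriteLn('TEncoding.ANSI:           ', TEncoding.ANSI.CodePage);
  // Currently implemented as GetANSI, so expected to equal the line above.
  WriteLn('TEncoding.Default:        ', TEncoding.Default.CodePage);
end.
```

If DefaultSystemCodePage has been changed (for example via SetMultiByteConversionCodePage), the first two values should differ from the last two; with my proposal TEncoding.Default would move to the first group.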
Ondrej