[fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()
Michael Van Canneyt
michael at freepascal.org
Fri Dec 27 10:40:38 CET 2019
On Fri, 27 Dec 2019, Ondrej Pokorny wrote:
> On 27.12.2019 0:19, Michael Van Canneyt wrote:
>> On Thu, 26 Dec 2019, Ondrej Pokorny wrote:
>>
>>> On 26.12.2019 19:29, Michael Van Canneyt wrote:
>>>> So no, I don't think these need to be changed/merged. What IMO can
>>>> be discussed is
>>>> which of these 2 need to be used as the default codepage in other
>>>> code. It
>>>> should then resolve the problems that appear, I think.
>>>
>>> That would be possible as well. But still please reconsider it:
>>> One reason: just from the convention - the default codepage to use
>>> should be TEncoding.Default. That is intuitive.
>>>
>>> Second reason: Now we have TEncoding.ANSI = TEncoding.Default. 2
>>> equal properties. And another FPC-only property
>>> TEncoding.SystemEncoding. That means 3 properties for 2 values.
>>
>> As far as I know, TEncoding.ANSI = CP_ACP.
>
> This is indeed not correct. See
> https://wiki.freepascal.org/FPC_Unicode_support :
> CP_ACP: this value represents the currently set "default system code
> page". See #Code page settings for more information.
I meant the Windows meaning of CP_ACP, not what the RTL makes of it.
I think the use of CP_ACP in the RTL is quite dubious.
Using CP_SYSTEM or so would have been better. No doubt this is again a
Delphi-compatibility naming :(
> TMBCSEncoding.Create(widestringmanager.GetStandardCodePageProc(scpAnsi))
This corresponds to what I meant.
>
> and
> TStandardCodePageEnum = (
> scpAnsi, // system Ansi code page (GetACP on windows)
>
> - as you can see the CP_ACP value does not correspond with the GetACP
> WinAPI call result. (But this is wanted as documented in
> https://wiki.freepascal.org/FPC_Unicode_support ).
>
>> Why should this equal TEncoding.Default ?
>
> sysencoding.inc:
>
> class function TEncoding.GetDefault: TEncoding;
> begin
> Result := GetANSI;
> end;
I know it is currently so; the question is: why? :)
Maybe Default should rather be SystemEncoding, see below.
>
>> I think TEncoding.Default = CP_UTF8 on linux ?
>
> Yes, in FPC this is correct. Also TEncoding.ANSI =CP_UTF8 on linux in FPC.
Not necessarily, if I read the wiki page correctly.
>
>
>> The main problem I see is that there is the system (OS) encoding, and the
>> encoding specified by DefaultSystemCodePage.
>>
>> These do not necessarily agree. So it makes sense to have 2
>> TEncodings: one
>> for the system encoding, one for the DefaultSystemCodePage variable. They
>> will not be equal.
>>
>> If they were, then the DefaultSystemCodePage variable makes no sense
>> whatever.
>
> Yes, indeed. Therefore I suggested
> * TEncoding.Default for the DefaultSystemCodePage variable
> and
> * TEncoding.ANSI for the system encoding.
>
> Currently we have
> * TEncoding.SystemEncoding for the DefaultSystemCodePage variable
> and
> * both TEncoding.ANSI and TEncoding.Default for the system encoding.
> (TEncoding.ANSI and TEncoding.Default are equal in FPC.)
In that case, why not simply change GetDefault to:
  class function TEncoding.GetDefault: TEncoding;
  begin
    Result := GetSystemEncoding;
  end;
Nothing needs to be removed. I consider SystemEncoding a better name than Default,
and the latter should only be kept for Delphi compatibility. IMHO it would be
better to avoid Default. In fact, I would change references to Default to
SystemEncoding for clarity: Default is completely non-descriptive.
If I understand your reasoning correctly, that should solve the problems you
report?
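
To check how the values discussed in this thread relate on a given system,
something like the following could be used. It is only a sketch: it assumes
an FPC version whose SysUtils already provides the FPC-only
TEncoding.SystemEncoding property, and the program name is made up for
illustration. The output depends on the platform and on the
DefaultSystemCodePage setting, so none is shown here.

```pascal
program ShowEncodings; // hypothetical name, for illustration only
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cwstring,{$endif} // installs a widestring manager on Unix
  SysUtils;
begin
  // Code page the RTL uses for CP_ACP strings
  WriteLn('DefaultSystemCodePage:    ', DefaultSystemCodePage);
  // Currently implemented as GetANSI (see sysencoding.inc quoted above)
  WriteLn('TEncoding.Default:        ', TEncoding.Default.CodePage);
  // System (OS) Ansi encoding
  WriteLn('TEncoding.ANSI:           ', TEncoding.ANSI.CodePage);
  // Assumed to follow DefaultSystemCodePage, as described in this thread
  WriteLn('TEncoding.SystemEncoding: ', TEncoding.SystemEncoding.CodePage);
end.
```

If Default and SystemEncoding print different code pages, that is the
mismatch under discussion here.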
Michael.