[fpc-devel] Unnecessary string copy from Utf8String to AnsiString if destination CP is UTF8

Ondrej Pokorny lazarus at kluug.net
Sun Apr 28 22:21:39 CEST 2019


On 28.04.2019 20:22, Jonas Maebe wrote:
> On 28/04/2019 14:10, Ondrej Pokorny wrote:
>> If changing a string via a PChar is not allowed in FPC then the 
>> argument with refcount is not really valid.
>
> It's only for string constants.

Str is not a string constant; it is a variable:

program PCharTest;
var
   Str: AnsiString;
   P: PAnsiChar;
begin
   Str := 'hello';
   P := PAnsiChar(Str);
   P[1] := 'x'; // SIGSEGV in FPC, OK in Delphi
end.

If Str points to a read-only buffer, FPC has to recreate it and assign a 
read/write buffer at this line at the latest:

   P := PAnsiChar(Str);

I reported it: https://bugs.freepascal.org/view.php?id=35461
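
Until that is resolved, a minimal sketch of a workaround, assuming the 
RTL routine UniqueString behaves as documented (it gives the variable its 
own buffer with a reference count of 1). The program name is made up for 
illustration:

```pascal
program PCharWriteSketch;
{$mode objfpc}{$H+}
var
   Str: AnsiString;
   P: PAnsiChar;
begin
   Str := 'hello';
   // Assumption: UniqueString replaces the constant's read-only buffer
   // with a private heap copy that may be modified.
   UniqueString(Str);
   P := PAnsiChar(Str);
   P[1] := 'x'; // writes into the private copy, not the constant data
   Writeln(Str);
end.
```

The explicit call is only needed because the cast itself does not make 
the string unique, which is the point of the report.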


>> It's funny to see that the holy mantra of the "implementation detail" 
>> is used once to support a different behavior and the second time to 
>> fight it :)
>
> When a new feature is added, we always try to be compatible with 
> Delphi as much as possible, even at the implementation level. The 
> reason is that many Delphi users refuse to accept that there is a 
> difference between a language definition and an implementation 
> decision/detail.
>
> Once a feature has been implemented in FPC, we also try to keep it the 
> same as much as possible (except when needed to fix bugs), because 
> many FPC users are of the same opinion as Delphi users in this regard 
> (and even if they accept it, it's still no fun if your code breaks 
> when you update the compiler, even if it is your own fault).

Thanks for the feedback! Yes, I remember the big discussion about the 
case-else optimization for enumeration types.

But honestly that one (case-else) has been a common programming 
habit/mistake/whatever.

I don't think anybody's code depends on the fact that
String1 := String2;
sometimes makes a copy of String2 and sometimes only increases the 
refcount, depending on which 8-bit string types are used on either side 
of the assignment.
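
A rough way to observe this, as a sketch only: compare the data pointers 
after the assignment. The program and variable names are hypothetical, 
and the result of the second line depends on the compiler version and on 
the default system code page, so no output is claimed here:

```pascal
program AssignCopyCheck;
{$mode objfpc}{$H+}
var
   Src: Utf8String;
   SameType: Utf8String;
   OtherType: AnsiString;
begin
   Src := 'hello';
   Src := Src + ' world'; // build at run time so Src owns a heap buffer
   SameType := Src;       // same declared type on both sides
   OtherType := Src;      // different 8-bit string type on the left
   // Equal pointers mean only the refcount was increased;
   // different pointers mean the data was copied (and possibly converted).
   Writeln('Utf8String -> Utf8String shares buffer: ',
      Pointer(SameType) = Pointer(Src));
   Writeln('Utf8String -> AnsiString shares buffer: ',
      Pointer(OtherType) = Pointer(Src));
end.
```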

Furthermore, there is still an implementation difference with Delphi 
in how a string constant is assigned:

program ConstStringTest;
var
   Str: AnsiString;
begin
   Str := 'hello';
   Writeln(PInteger(PByte(Str) - 8)^); // 1 in Delphi, -1 in FPC
   ReadLn;
end.

This difference is what causes bug #35461, which I mentioned above.

Ondrej



