[fpc-devel] Unnecessary string copy from Utf8String to AnsiString if destination CP is UTF8

Jonas Maebe jonas at freepascal.org
Sun Apr 28 22:53:05 CEST 2019


On 28/04/2019 22:21, Ondrej Pokorny wrote:
> On 28.04.2019 20:22, Jonas Maebe wrote:
>> On 28/04/2019 14:10, Ondrej Pokorny wrote:
>>> If changing a string via a PChar is not allowed in FPC then the 
>>> argument with refcount is not really valid.
>>
>> It's only for string constants.
> 
> Str is not a string constant but it is a variable:
> 
> program PCharTest;
> var
>    Str: AnsiString;
>    P: PAnsiChar;
> begin
>    Str := 'hello';
>    P := PAnsiChar(Str);
>    P[1] := 'x'; // SIGSEGV in FPC, OK in Delphi
> end.
> 
> If Str points to a read-only buffer, FPC has to recreate it and assign a 
> read/write buffer at the latest at this line:
> 
>    P := PAnsiChar(Str);
> 
> I reported it: https://bugs.freepascal.org/view.php?id=35461

FPC has never supported this. Before 
https://bugs.freepascal.org/view.php?id=24088 was fixed (in 2013), such a 
write changed the string constant itself (so that further uses of the same 
string constant anywhere in the program would see the changed value). 
Since that fix, it has always crashed.

> I don't think that anybody's code depends on the fact that
> String1 := String2;
> generates once a copy of String2 and another time it only increases the 
> refcount, depending on what 8-bit string types are used on both sides of 
> the assignment.

You'd be surprised at the small details people depend on to 
(micro-)optimise their code, in this case e.g. by not adding 
(currently) unnecessary UniqueString calls.
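
For illustration, here is a minimal sketch of where such a UniqueString 
call would go when a string is modified through a PAnsiChar. It is 
adapted from the PCharTest program quoted above; the program name is 
made up, and it assumes that UniqueString gives Str its own writable 
buffer with reference count 1 even when Str currently refers to a 
string constant.

```pascal
program UniqueStringSketch; // hypothetical name, adapted from PCharTest

var
  Str: AnsiString;
  P: PAnsiChar;
begin
  Str := 'hello';       // Str refers to constant data at this point
  UniqueString(Str);    // request a private, writable copy (refcount 1)
  P := PAnsiChar(Str);  // take the pointer only after making Str unique
  P[1] := 'x';          // PAnsiChar indexing is 0-based: second character
  Writeln(Str);
end.
```

The order matters: a pointer taken before the UniqueString call would 
still refer to the old buffer.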

> Furthermore, you still have an implementation difference with Delphi 
> about the const string assignment:
> 
> program ConstStringTest;
> var
>    Str: AnsiString;
> begin
>    Str := 'hello';
>    Writeln(PInteger(PByte(Str) - 8)^); // 1 in Delphi, -1 in FPC
>    ReadLn;
> end.
> 
> This difference results in the bug #35461 I mentioned above.

Indeed. It's also documented (see the last remark on 
https://freepascal.org/docs-html/ref/refse24.html , although there seems 
to be something wrong with the formatting). Changing that would slow 
down the use of ansistring constants a lot though (which most people 
don't like either).


Jonas


