[fpc-devel] Unnecessary string copy from Utf8String to AnsiString if destination CP is UTF8
Jonas Maebe
jonas at freepascal.org
Sun Apr 28 22:53:05 CEST 2019
On 28/04/2019 22:21, Ondrej Pokorny wrote:
> On 28.04.2019 20:22, Jonas Maebe wrote:
>> On 28/04/2019 14:10, Ondrej Pokorny wrote:
>>> If changing a string via a PChar is not allowed in FPC then the
>>> argument with refcount is not really valid.
>>
>> It's only for string constants.
>
> Str is not a string constant but it is a variable:
>
> program PCharTest;
> var
>   Str: AnsiString;
>   P: PAnsiChar;
> begin
>   Str := 'hello';
>   P := PAnsiChar(Str);
>   P[1] := 'x'; // SIGSEGV in FPC, OK in Delphi
> end.
>
> If Str points to a read-only buffer, FPC has to recreate it and assign a
> read/write buffer at the latest at this line:
>
> P := PAnsiChar(Str);
>
> I reported it: https://bugs.freepascal.org/view.php?id=35461
FPC has never supported this. Before
https://bugs.freepascal.org/view.php?id=24088 was fixed (in 2013), writing
through the PChar changed the string constant itself (so that further uses
of the same string constant anywhere in the program would see the changed
value). Since that fix, it has always crashed.
> I don't think that anybody's code depends on the fact that
> String1 := String2;
> generates once a copy of String2 and another time it only increases the
> refcount, depending on what 8-bit string types are used on both sides of
> the assignment.
You'd be surprised at the small details people depend on to
(micro-)optimise their code, in this case e.g. by not adding
(currently) unnecessary UniqueString calls.
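As a minimal sketch of what such a UniqueString call looks like: after an
assignment that may only share the buffer, UniqueString gives the
destination its own writable copy before anything writes through a pointer.
The program and variable names are made up for illustration, and the
outputs in the comments assume the behaviour described in this thread
(string constants are read-only and are not copied on plain assignment).

```pascal
program UniqueStringSketch; // hypothetical name, for illustration only
var
  S1, S2: AnsiString;
  P: PAnsiChar;
begin
  S2 := 'hello';        // S2 references the read-only string constant
  S1 := S2;             // may share the same buffer as S2, no copy made
  UniqueString(S1);     // make sure S1 owns a private, writable copy
  P := PAnsiChar(S1);   // take the pointer only after S1 is unique
  P[1] := 'x';          // PAnsiChar indexing is zero-based: second char
  Writeln(S1);          // hxllo
  Writeln(S2);          // hello (the original is untouched)
end.
```

Without the UniqueString call, the write through P would hit the shared
(here: constant) buffer, which is the crash case discussed above.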
> Furthermore, you still have an implementation difference with Delphi
> about the const string assignment:
>
> program ConstStringTest;
> var
>   Str: AnsiString;
> begin
>   Str := 'hello';
>   Writeln(PInteger(PByte(Str) - 8)^); // 1 in Delphi, -1 in FPC
>   ReadLn;
> end.
>
> This difference results in the bug #35461 I mentioned above.
Indeed. It's also documented (see the last remark on
https://freepascal.org/docs-html/ref/refse24.html , although there seems
to be something wrong with the formatting). Changing that would slow
down the use of ansistring constants a lot though (which most people
don't like either).
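For checking this without hard-coding the header offset, here is a small
sketch that reads the reference count through StringRefCount from the
System unit instead of pointer arithmetic. It assumes an FPC version that
provides StringRefCount (3.0 or later, as far as I know); the program and
variable names are hypothetical.

```pascal
program RefCountSketch; // hypothetical name, for illustration only
var
  ConstStr, HeapStr: AnsiString;
begin
  ConstStr := 'hello';               // references the string constant
  HeapStr := ConstStr;
  UniqueString(HeapStr);             // force a private heap copy
  Writeln(StringRefCount(ConstStr)); // -1 in FPC, as noted above
  Writeln(StringRefCount(HeapStr));  // refcount of the private heap copy
end.
```

The -1 marks the constant as not reference counted, which is why an
assignment from it is cheap and why its buffer must not be written to.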
Jonas