[fpc-devel] Unnecessary string copy from Utf8String to AnsiString if destination CP is UTF8

Ondrej Pokorny lazarus at kluug.net
Sun Apr 28 23:28:16 CEST 2019


On 28.04.2019 22:53, Jonas Maebe wrote:
> On 28/04/2019 22:21, Ondrej Pokorny wrote:
>> This difference results in the bug #35461 I mentioned above.
>
> Indeed. It's also documented (see the last remark on 
> https://freepascal.org/docs-html/ref/refse24.html , although there 
> seems to be something wrong with the formatting). Changing that would 
> slow down the use of ansistring constants a lot though (which most 
> people don't like either).

OK, I understand now. We have the assignment:

S1 := S2;

There are (among others) these 2 cases:

1.) If S2 is a constant, S1 will point to a read-only buffer and the 
refcount will be -1. This is wanted because "changing that would slow 
down the use of ansistring constants a lot". It is not 
implementation-compatible with Delphi.

2.) If S2 is a UTF8String and S1 is an AnsiString with CP=UTF8, S1 gets 
a new copy of S2 with refcount = 1. This is wanted because it is 
implementation-compatible with Delphi and people may depend on it. It 
slows down the use of the UTF8String type a lot.
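
To make the two cases concrete, here is a minimal sketch that prints the 
reference counts. It assumes FPC 3.x, where StringRefCount and 
SetMultiByteConversionCodePage are available in the System unit. The 
program and variable names are only illustrative, and the values printed 
depend on the compiler version.

```pascal
program RefCountSketch;
{$mode objfpc}{$H+}

var
  S1: AnsiString;
  U: UTF8String;
begin
  // Assumption: switch the default AnsiString code page to UTF-8 so that
  // S1 is an "AnsiString with CP=UTF8" as in case 2.
  SetMultiByteConversionCodePage(CP_UTF8);

  // Case 1: the right-hand side is a constant.
  S1 := 'constant';
  WriteLn('case 1, refcount of S1: ', StringRefCount(S1));

  // Case 2: the right-hand side is a UTF8String built at run time.
  U := 'abc';
  U := U + U;  // concatenation yields a heap-allocated string
  S1 := U;
  WriteLn('case 2, refcount of S1: ', StringRefCount(S1));
  WriteLn('case 2, refcount of U:  ', StringRefCount(U));
  // True would mean both variables share one buffer (no copy was made).
  WriteLn('case 2, same buffer:    ', Pointer(S1) = Pointer(U));
end.
```

If the behaviour described above applies, case 1 should report -1 and 
case 2 should report separate buffers.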

This seems quite inconsistent to me. But no problem, I'll change 
all method parameters that expect UTF8 strings from UTF8String to 
RawByteString and manually take care that only UTF8 data is fed to them.
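
A rough sketch of that workaround, assuming FPC 3.x: ProcessUtf8 is a 
hypothetical routine name, and the StringCodePage check is just one 
possible way to "manually take care" of the input. It is not something 
the compiler enforces.

```pascal
program RawByteParamSketch;
{$mode objfpc}{$H+}

// Hypothetical routine. A RawByteString parameter accepts any ansistring
// type without the implicit code page conversion (and copy) that a
// UTF8String or AnsiString parameter could trigger.
procedure ProcessUtf8(const S: RawByteString);
begin
  // Manual check that stands in for the type-based guarantee.
  if (Length(S) > 0) and (StringCodePage(S) <> CP_UTF8) then
  begin
    WriteLn('unexpected code page: ', StringCodePage(S));
    Exit;
  end;
  WriteLn('got ', Length(S), ' bytes of UTF-8 data');
end;

var
  U: UTF8String;
begin
  U := 'abc';
  U := U + U;      // build the string at run time
  ProcessUtf8(U);  // passed by reference, no conversion involved
end.
```

The price is that a caller can now pass a string in any code page, so the 
check (or plain discipline at the call sites) has to replace what the 
UTF8String parameter type used to guarantee.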

Thanks
Ondrej



