[fpc-devel] Unnecessary string copy from Utf8String to AnsiString if destination CP is UTF8
Jonas Maebe
jonas at freepascal.org
Sun Apr 28 12:35:04 CEST 2019
On 28/04/2019 09:55, Ondrej Pokorny wrote:
> IMO there is an unnecessary Move() operation in fpc_AnsiStr_To_AnsiStr
> if (orgcp=cp).
>
> fpc_AnsiStr_To_AnsiStr creates a copy of the AnsiString even if the
> destination and source codepages are equal. See:
>
> program AnsiUtf8;
> var
>   Utf8Str: UTF8String;
>   RawStr: RawByteString;
>   Str: string;
> begin
>   DefaultSystemCodePage := CP_UTF8;
>   Utf8Str := 'hello';
>   Str := Utf8Str; // this makes a copy (fpc_AnsiStr_To_AnsiStr -> Move)
>
>   RawStr := 'hello';
>   SetCodePage(RawStr, CP_UTF8, False);
>   Str := RawStr; // this doesn't make a copy
> end.
>
> Is there a reason for this?
It's probably what Delphi does as well. The result is that the refcount
of a string after such an assignment is currently always one. I've had
my share for now of fighting with people who rely on implementation
details (like this one), so I'd rather not change that unless Delphi
does it too (and even then we may get complaints that FPC is not
backwards compatible in this respect).
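For anyone who wants to observe the behaviour in question, here is a
minimal sketch rather than a definitive test. It assumes that
StringRefCount and UniqueString from the System unit behave as
documented, and it declares the destination as AnsiString so that the
result does not depend on the {$H} setting. The program name is made up
for illustration.

```pascal
program RefCountProbe;
{$mode objfpc}{$H+}
var
  Utf8Str: UTF8String;
  Str: AnsiString;
begin
  DefaultSystemCodePage := CP_UTF8;
  Utf8Str := 'hello';
  // String literals are constants with a refcount of -1, so force a
  // heap-allocated, uniquely owned string before probing.
  UniqueString(Utf8Str);
  // Static code pages differ (CP_UTF8 vs. CP_ACP), so the compiler
  // routes this assignment through fpc_AnsiStr_To_AnsiStr.
  Str := Utf8Str;
  // If a copy was made, the two variables point at different buffers
  // and Str owns its buffer alone.
  WriteLn('same buffer: ', Pointer(Str) = Pointer(Utf8Str));
  WriteLn('refcount of Str: ', StringRefCount(Str));
end.
```

With the copying behaviour described above, the buffers should differ
and the refcount of Str should be one. If the copy were skipped, the
two variables would share one buffer and the refcount would be higher.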
> See the attached patch.
Your patch will return an empty string if orgcp is different from both
cp and CP_NONE.
Jonas