[fpc-devel] Unnecessary string copy from Utf8String to AnsiString if destination CP is UTF8

Jonas Maebe jonas at freepascal.org
Sun Apr 28 12:35:04 CEST 2019


On 28/04/2019 09:55, Ondrej Pokorny wrote:

> IMO there is an unnecessary Move() operation in fpc_AnsiStr_To_AnsiStr 
> if (orgcp=cp).
> 
> fpc_AnsiStr_To_AnsiStr creates a copy of the AnsiString even if the 
> destination and source codepages are equal. See:
> 
> program AnsiUtf8;
> var
>    Utf8Str: UTF8String;
>    RawStr: RawByteString;
>    Str: string;
> begin
>    DefaultSystemCodePage := CP_UTF8;
>    Utf8Str := 'hello';
>    Str := Utf8Str; // this makes a copy (fpc_AnsiStr_To_AnsiStr -> Move)
> 
>    RawStr := 'hello';
>    SetCodePage(RawStr, CP_UTF8, False);
>    Str := RawStr; // this doesn't make a copy
> end.
> 
> Is there a reason for this?

It's probably what Delphi does as well. The result is that the refcount 
of a string after such an assignment is currently always one. I've had 
my share for now of fighting with people who rely on implementation 
details (like this one), so I'd rather not change that unless Delphi 
does it too (and even then we may get complaints that FPC is not 
backwards compatible in this respect).
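
The refcount behaviour described above can be observed with a small test program. This is only a sketch, assuming the System unit's StringRefCount is available and that the program is built with {$H+} so that "string" is an AnsiString. The program name is made up for illustration, and the strings are built at run time because string constants are not reference counted in the usual way. The printed values depend on the compiler version, so no expected output is given.

```pascal
program RefCountCheck;
{$mode objfpc}{$H+}
var
  Utf8Str: UTF8String;
  RawStr: RawByteString;
  Str: string;
begin
  DefaultSystemCodePage := CP_UTF8;

  // Build the source at run time so it lives on the heap
  // instead of pointing at a string constant.
  Utf8Str := 'hel';
  Utf8Str := Utf8Str + 'lo';
  // Per the thread, this goes through fpc_AnsiStr_To_AnsiStr.
  Str := Utf8Str;
  WriteLn('UTF8String -> string');
  WriteLn('  refcount of source:      ', StringRefCount(Utf8Str));
  WriteLn('  refcount of destination: ', StringRefCount(Str));
  WriteLn('  shares buffer:           ', Pointer(Utf8Str) = Pointer(Str));

  RawStr := 'hel';
  RawStr := RawStr + 'lo';
  // Relabel the code page without converting the bytes.
  SetCodePage(RawStr, CP_UTF8, False);
  // Per the thread, this assignment does not copy.
  Str := RawStr;
  WriteLn('RawByteString -> string');
  WriteLn('  refcount of source:      ', StringRefCount(RawStr));
  WriteLn('  refcount of destination: ', StringRefCount(Str));
  WriteLn('  shares buffer:           ', Pointer(RawStr) = Pointer(Str));
end.
```

Comparing the two "shares buffer" lines shows whether a given assignment copied the data. Code that depends on either result relies on exactly the kind of implementation detail mentioned above.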

> See the attached patch.

Your patch will return an empty string if orgcp is different from both 
cp and CP_NONE.


Jonas


