[fpc-devel] Encoded AnsiString

Hans-Peter Diettrich DrDiettrich1 at aol.com
Tue Jan 7 20:22:41 CET 2014


Jonas Maebe wrote:
> On 07 Jan 2014, at 15:35, Hans-Peter Diettrich wrote:

>> 2) The stupid conversion to CP_ACP in an assignment *to* an
>> RawByteString should be dropped. This applies in detail to the
>> assignment to *function results*.
> 
> The conversion does not happen for all assignments, it only happens
> for concatenations that are assigned to RawByteString. And even then
> it doesn't always happen. Please read the wiki page I wrote (trying
> to prevent exactly this kind of wrong statements from being further
> repeated, and obviously failing).

I've tested the behaviour, and the conversion occurs not only in
assignments to RawByteStrings. See the test case below.


>> Test case:
>>   function conc(a, b: RawByteString): UTF8String;
>>   begin
>>     Result := a + b;
>>   end;
> 
> This will always return CP_UTF8 on FPC. Does it really return CP_ACP
> on Delphi? Even if it does, I doubt we will change that.

This leads me back to my previous statement: it would be simpler to do
things right than to try to achieve compatibility with *all* Delphi
flaws, in particular when those flaws have never been documented...
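
For anyone who wants to check this on their own compiler, here is a minimal sketch built around the test case quoted above. Only the conc function comes from this thread; the operand contents and the code pages 1252 and CP_UTF8 attached to them are arbitrary assumptions for illustration.

```pascal
program concattest;

{$mode objfpc}{$H+}

// Test case from the thread: both operands are RawByteString,
// the function result is declared as UTF8String.
function conc(a, b: RawByteString): UTF8String;
begin
  Result := a + b;
end;

var
  s1, s2: RawByteString;
  r: UTF8String;
begin
  s1 := 'abc';
  s2 := 'def';
  // Assumption for illustration: tag the operands with two different
  // dynamic code pages without converting their bytes.
  SetCodePage(s1, 1252, False);
  SetCodePage(s2, CP_UTF8, False);
  r := conc(s1, s2);
  // Report the dynamic code page the concatenation result ended up with.
  WriteLn('dynamic code page of result: ', StringCodePage(r));
end.
```

According to Jonas' reply above, FPC should report CP_UTF8 here; what Delphi reports for the same program is the open question.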

> We even
> couldn't easily do that, because we don't know the static code pages
> of the strings that are concatenated inside the RTL routine that
> handles this.

Right! Only the compiler can do that, and therefore the compiler should 
do it right.

>> Then TStrings could be based on such RawByteStrings, without excess
>> conversions or losses.
> 
> The problem with changing TStrings from AnsiString to RawByteString
> is not so much related to the behaviour of RawByteString, but more
> regarding descendent classes in existing third party (= user) source
> code that override methods using AnsiString parameters. We don't want
> to force everyone to rewrite their code so it uses RawByteString (if
> anything, RawByteString should probably be used as little as possible
> in user code, because always correctly dealing with all possible code
> pages is very hard).

Right <sigh>

>> Sorting (TStringList) eventually should ignore the dynamic
>> encoding, i.e. work on a strictly binary (byte-by-byte) basis.
> 
> Looking for just one second at the definition of the Sort methods of
> TStringList (and TStrings) would have prevented you from writing the
> above statement, which does not make any sense whatsoever (unless you
> want the compiler to start changing all code where a programmer
> passes a comparison function that does take code pages into account
> to the Sort methods of TStrings/TStringList).

Fine that you took the bait ;-)

DoDi



