[fpc-devel] Encoded AnsiString
Hans-Peter Diettrich
DrDiettrich1 at aol.com
Tue Jan 7 20:22:41 CET 2014
Jonas Maebe wrote:
> On 07 Jan 2014, at 15:35, Hans-Peter Diettrich wrote:
>> 2) The stupid conversion to CP_ACP in an assignment *to* an
>> RawByteString should be dropped. This applies in particular to the
>> assignment to *function results*.
>
> The conversion does not happen for all assignments, it only happens
> for concatenations that are assigned to RawByteString. And even then
> it doesn't always happen. Please read the wiki page I wrote (trying
> to prevent exactly this kind of wrong statement from being further
> repeated, and obviously failing).
I've tested the behaviour, and the conversion does not occur only in
assignments to RawByteStrings. See the test case below.
>> Test case:
>>   function conc(a,b: RawByteString): UTF8String;
>>   begin
>>     Result := a+b;
>>   end;
>
> This will always return CP_UTF8 on FPC. Does it really return CP_ACP
> on Delphi? Even if it does, I doubt we will change that.
This leads me back to my previous statement: it will be simpler to do
things right than to try to achieve compatibility with *all* Delphi
flaws, in particular when those flaws have never been documented...
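[A minimal sketch of how the test case above could be checked, assuming
an FPC version with code-page-aware AnsiStrings (2.7.1 or later). The
program name, the variable names and the sample literals are
illustrative only; no output is shown because it depends on the
compiler and on the system code page.]

```pascal
program concat_codepage;
{$mode objfpc}{$H+}

// The function from the test case in this thread.
function conc(a, b: RawByteString): UTF8String;
begin
  Result := a + b;
end;

var
  a, b: RawByteString;  // illustrative inputs, plain ASCII
  r: UTF8String;
begin
  a := 'abc';
  b := 'def';
  r := conc(a, b);
  // StringCodePage reports the dynamic code page stored in the string header.
  WriteLn('Dynamic code page of result: ', StringCodePage(r));
  // Reference values to compare against.
  WriteLn('CP_UTF8:                     ', CP_UTF8);
  WriteLn('DefaultSystemCodePage:       ', DefaultSystemCodePage);
end.
```

Comparing the first printed value with CP_UTF8 and with
DefaultSystemCodePage shows which of the two behaviours discussed here
a given compiler exhibits.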
> We couldn't
> even easily do that, because we don't know the static code pages
> of the strings that are concatenated inside the RTL routine that
> handles this.
Right! Only the compiler can do that, and therefore the compiler should
do it right.
>> Then TStrings could be based on such RawByteStrings, without excess
>> conversions or losses.
>
> The problem with changing TStrings from AnsiString to RawByteString
> is not so much related to the behaviour of RawByteString, but more
> regarding descendent classes in existing third party (= user) source
> code that override methods using AnsiString parameters. We don't want
> to force everyone to rewrite their code so it uses RawByteString (if
> anything, RawByteString should probably be used as little as possible
> in user code, because always correctly dealing with all possible code
> pages is very hard).
Right <sigh>
>> Sorting (TStringList) should possibly ignore the dynamic
>> encoding, i.e. work on a strictly binary (byte-by-byte) basis.
>
> Looking for just one second at the definition of the Sort methods of
> TStringList (and TStrings) would have prevented you from writing the
> above statement, which does not make any sense whatsoever (unless you
> want the compiler to start changing all code where a programmer
> passes a comparison function that does take code pages into account
> to the Sort methods of TStrings/TStringList).
Fine that you took the bait ;-)
DoDi