[fpc-devel] Delphi new AnsiStrings are incredibly broken :-(
Hans-Peter Diettrich
DrDiettrich1 at aol.com
Thu Oct 13 15:25:19 CEST 2011
Following up on the test program sent by Paul, I played some more with
AnsiStrings in Delphi XE, with catastrophic results :-(
Incredible bugs show up as soon as MBCS enter the scene, and that
matters here: UTF-8 is widely used in FPC and Lazarus and is the
preferred string encoding on Linux. With
var
a: AnsiString;
u, u2: UTF8String;
even a simple assignment of
u := 'ü';
results in a string of Length 1; the second byte of the two-byte UTF-8
sequence has gone away.
A direct comparison of
a := 'äöüü';
u := a;
WriteLn(a = u);
will sometimes show True and sometimes False - dunno why.
Also, both of these searches fail:
Pos('ü', u); //0
Pos(a[3], u); //0
Since u := a[3] doesn't work, I extracted the character from u itself:
u2 := Copy(u, 5, 2); //'ü'
WriteLn(Pos(u2, u)); //5 - correct!
but
WriteLn(Pos(string(u2), u)); //3
returns the index into the UnicodeString to which u was silently
converted, not the byte index into u.
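The two results (5 and 3) are simply the two ways of counting positions in 'äöüü'. A minimal sketch of that arithmetic follows, written in Python only because it makes the byte view and the code-unit view easy to inspect. It does not reproduce Delphi's RTL, and the helper name pos1 is made up for illustration.

```python
# Illustration only: pos1 is a hypothetical helper that mimics Pascal's
# 1-based Pos(); it is not part of any RTL or library.
def pos1(sub, s):
    """Return the 1-based index of sub in s, or 0 if it is absent."""
    return s.find(sub) + 1

text = "\u00e4\u00f6\u00fc\u00fc"      # 'äöüü', written with escapes
utf8 = text.encode("utf-8")            # the bytes a UTF8String would hold

# 'ü' is U+00FC: two bytes in UTF-8, one 16-bit code unit in UTF-16.
u_utf8 = "\u00fc".encode("utf-8")
print(len(u_utf8))                     # 2

# Searching the UTF-8 bytes gives a byte index.
byte_index = pos1(u_utf8, utf8)
print(byte_index)                      # 5

# All four characters are in the BMP, so the index in a Python str
# equals the index in UTF-16 code units.
unit_index = pos1("\u00fc", text)
print(unit_index)                      # 3
```

Index 5 is the one that is usable with Copy(u, 5, 2) on the UTF-8 string; index 3 is only meaningful for the UTF-16 copy.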
After all these flaws I see no use for Delphi-compatible AnsiString
procedures in an environment where MBCS (UTF-8) strings must be handled.
If we ever want to proceed with the new AnsiStrings, we should specify
exactly what every RTL procedure *should* do, in a meaningful and usable
way, regardless of Delphi compatibility.
E.g. Pos() should convert the first argument (SubStr) to the encoding of
the second string, if the two differ, before searching for the SubStr.
Only then can the result be used as an index into the second string.
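The proposed rule can be sketched as follows. This is an illustration in Python under stated assumptions, not an existing RTL routine: pos_converted and its parameters are hypothetical names, both strings are passed as raw bytes plus an encoding name, and only byte-oriented encodings such as Latin-1 and UTF-8 are considered, so the result is a 1-based byte index.

```python
# Sketch of the proposed Pos() rule; pos_converted and its parameter
# names are hypothetical, not an existing RTL or library API.
def pos_converted(sub: bytes, sub_enc: str, s: bytes, s_enc: str) -> int:
    """Return the 1-based byte index of sub in s, or 0 if absent.

    sub is first re-encoded into the encoding of s, so the result is
    always valid as an index into s.
    """
    if not sub:
        return 0                       # like Pos(), nothing to find
    if sub_enc != s_enc:
        # Convert the substring to the encoding of the searched string.
        sub = sub.decode(sub_enc).encode(s_enc)
    return s.find(sub) + 1

s = "\u00e4\u00f6\u00fc\u00fc".encode("utf-8")    # 'äöüü' as UTF-8
sub = "\u00fc".encode("latin-1")                  # 'ü' as one Latin-1 byte

converted = pos_converted(sub, "latin-1", s, "utf-8")
naive = s.find(sub) + 1                           # no conversion at all
print(converted)                                  # 5
print(naive)                                      # 0
```

Without the conversion the single Latin-1 byte never matches the two-byte UTF-8 sequence, which is the same kind of failure as the Pos(a[3], u) case. A byte-wise search is safe for UTF-8 because a valid sequence cannot match in the middle of another character; other MBCS encodings may need a character-aware search instead.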
Now we can guess whether the flawed handling of AnsiStrings in Delphi is
due to sloppiness of the implementors, or whether necessary conversions
are skipped for speed reasons. Whenever there is a chance that Pos or
another standard function must convert an argument, the use of strings
with different encodings becomes very questionable, performance-wise.
Perhaps it would be better (and easier to implement) if only one string
type were used in an application, with possible values of
native/UTF-8/UTF-16. Required conversions could then be restricted to
I/O methods (file encoding), ShortString conversions (which
codepage???), and external subroutines (OS, widgetsets).
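The "convert only at the boundaries" idea could look roughly like this. Again a Python sketch under assumptions: read_text and write_text are hypothetical helper names, Python's str plays the role of the single application-wide string type, and in-memory streams stand in for files.

```python
import io

# Hypothetical boundary helpers: all encoding knowledge lives here,
# the rest of the program only ever sees the one internal string type.
def read_text(stream, file_encoding):
    """Decode once, at the input boundary."""
    return stream.read().decode(file_encoding)

def write_text(stream, text, file_encoding):
    """Encode once, at the output boundary."""
    stream.write(text.encode(file_encoding))

# A Latin-1 "file" is read, handled internally, and written as UTF-8.
src = io.BytesIO("\u00e4\u00f6\u00fc\u00fc".encode("latin-1"))
text = read_text(src, "latin-1")
index = text.find("\u00fc") + 1        # no conversion needed while searching

dst = io.BytesIO()
write_text(dst, text, "utf-8")
out_len = len(dst.getvalue())          # 4 characters, 2 bytes each in UTF-8
```

Inside the program, searching and indexing never trigger a conversion, so the performance concern about mixed encodings does not arise.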
DoDi