[fpc-devel] Unicode proceedings
Hans-Peter Diettrich
DrDiettrich1 at aol.com
Fri Nov 18 18:20:03 CET 2011
Graeme Geldenhuys wrote:
> On 2011-11-18 12:11, Michael Schnell wrote:
>> Why should a type that is capable of holding multiple different UTF
>> encodings be called "ANSIString"? IMHO this is very counter-intuitive.
>
> Every time I see this used in Delphi too, I start to laugh as well. It
> makes no sense. Call the damn thing UnicodeString because that is
> exactly what it is.
I'm not sure, but IMO UTF encodings are not Ansi codepages. That's why
UTF-16 is not an allowed encoding of an AnsiString, and UTF-8 seems to
be accepted only by accident and is not fully supported by Delphi.
> The other annoyance of Delphi is the assumption that the term "Unicode
> String" always means UTF-16. A real slap in the face for unicode.org guys.
How so? UTF-16 is a valid Unicode encoding, just as UTF-8 is. The
only difference between Ansi and Unicode strings is the range of
characters (code points) that are expected in the strings and that
are supported by the supplied string handling procedures.
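A minimal sketch of this point, written in Python for brevity rather than in Delphi or FPC; the sample text and variable names are illustrative assumptions. It shows that UTF-8 and UTF-16 are different byte representations of the same code points, while a single Ansi codepage (cp1252 here) covers only part of that range.

```python
# Illustrative sketch only: the sample text and names are assumptions.
text = "Grüße, Привет"  # Western European characters plus Cyrillic

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# The two encodings produce different byte sequences ...
same_bytes = utf8_bytes == utf16_bytes
# ... but both decode to the same sequence of code points.
same_codepoints = utf8_bytes.decode("utf-8") == utf16_bytes.decode("utf-16-le")

# An Ansi codepage only covers a limited range of characters,
# so the Cyrillic part cannot be stored in cp1252.
try:
    text.encode("cp1252")
    fits_cp1252 = True
except UnicodeEncodeError:
    fits_cp1252 = False
```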
Did you ever notice that string handling was restricted to the system
codepage before AnsiStrings with an encoding were introduced? Unicode
string handling is independent of any codepage issues, and can be
implemented in the same way (API) for any Unicode (UTF) representation.
While WideString was only a container for UTF-16, without string handling
support beyond the WinAPI, such functionality has been added to
UnicodeString. BTW such functionality is now available for AnsiStrings
too, at least by implicit conversion to and from Unicode strings.
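The idea of handling codepage strings by converting to Unicode and back can be sketched as follows. This is Python, not Delphi's actual implementation, and `upper_in_codepage` is a hypothetical helper introduced only for illustration. The same routine works for any codepage because the real work happens on the Unicode side.

```python
def upper_in_codepage(data: bytes, codepage: str) -> bytes:
    """Hypothetical helper: upper-case codepage-encoded bytes via Unicode."""
    unicode_text = data.decode(codepage)  # Ansi bytes -> Unicode string
    upper_text = unicode_text.upper()     # codepage-independent string handling
    return upper_text.encode(codepage)    # Unicode string -> Ansi bytes


# The same routine serves two unrelated codepages.
cyrillic = "привет".encode("cp1251")
greek = "γεια".encode("cp1253")

result_cyrillic = upper_in_codepage(cyrillic, "cp1251")
result_greek = upper_in_codepage(greek, "cp1253")
```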
> Can't we just have a single damn string type like Java and some other
> languages. Let's just call it...ummm.... String! ;-)
Isn't this exactly what Delphi did? There is one generic "string" type,
which now finally is a Unicode string - just like in Java?
But Delphi and FPC suffer from legacy types, which still have to be
supported.
DoDi