[fpc-devel] String and UnicodeString and UTF8String
LacaK
lacak at zoznam.sk
Wed Jan 12 07:16:40 CET 2011
>
> ...: the new ansistring type has a hidden "element size" field (in
> addition to the reference count, length and codepage), and from what I
> can see at page 10 of
> http://edn.embarcadero.com/article/images/38980/Delphi_and_Unicode.pdf,
> Delphi 2009's unicodestring is simply an ansistring(1200).
So it seems that if we had some "GenericString" with the properties
"reference count", "size", "character width" and "codepage", then all
other string types could be based on this string type. The other strings
would be only "shortcuts" and would internally use the same structure:
AnsiString = GenericString(with actual system ANSI code page (0) ... or
... without any explicit codepage ($ffff))
UTF8String = GenericString(with UTF-8 encoding)
UnicodeString = GenericString(with UTF-16 encoding)
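To make the idea concrete, here is a minimal sketch of such a shared
layout, written in Python only for illustration. The class GenericString
and the helpers utf8_string, unicode_string and convert are hypothetical
names, not FPC or Delphi APIs. The code page numbers 65001 and 1200 are
the usual Windows identifiers for UTF-8 and UTF-16LE; 0 and $ffff are the
values mentioned above.

```python
from dataclasses import dataclass

# Code page identifiers (0 and 0xFFFF are the values named in this mail).
CP_ACP = 0          # "actual system ANSI code page"
CP_UTF16 = 1200     # UTF-16LE
CP_UTF8 = 65001     # UTF-8
CP_NONE = 0xFFFF    # "without any explicit codepage"


@dataclass
class GenericString:
    """Hypothetical common layout: header fields plus raw payload bytes."""
    ref_count: int
    code_page: int
    element_size: int   # "character width": bytes per code unit
    data: bytes

    @property
    def length(self) -> int:
        # Length is counted in code units, not in bytes.
        return len(self.data) // self.element_size


def utf8_string(text: str) -> GenericString:
    """'Shortcut' type: GenericString with UTF-8 encoding."""
    return GenericString(1, CP_UTF8, 1, text.encode("utf-8"))


def unicode_string(text: str) -> GenericString:
    """'Shortcut' type: GenericString with UTF-16 encoding."""
    return GenericString(1, CP_UTF16, 2, text.encode("utf-16-le"))


def convert(s: GenericString, code_page: int) -> GenericString:
    """Re-encode the payload; this sketch handles only UTF-8 and UTF-16."""
    codecs = {CP_UTF8: ("utf-8", 1), CP_UTF16: ("utf-16-le", 2)}
    src_codec, _ = codecs[s.code_page]
    dst_codec, dst_size = codecs[code_page]
    text = s.data.decode(src_codec)
    return GenericString(1, code_page, dst_size, text.encode(dst_codec))
```

In this sketch the string types differ only in the header values, so a
conversion has to rewrite the payload and update "codepage" and
"character width" together.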
So it seems to me that there is agreement on adding "character width"
and "codepage" to the internal "string" record structure and on providing
conversions where needed, isn't there? (More or less the same approach
as in Delphi.)
Where there is no agreement is on what the default string encoding
should be (AnsiString($ffff), UTF-8, UTF-16 or UTF-32).
So, to return to my original question: is there any agreement on
some points related to the "future of the String type"?
P.S. I still do not understand how things can work correctly if the LCL
expects all AnsiStrings (String) to be UTF8Strings, but the RTL/FCL does
not strictly follow this (at least on Windows).
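A small sketch of the kind of mismatch I mean, in Python for
illustration only. The sample text and the choice of CP1250 as the
system ANSI code page are assumptions; any non-UTF-8 ANSI code page
would show a similar effect.

```python
text = "Ľubomír"                     # sample text with non-ASCII letters

utf8_bytes = text.encode("utf-8")    # payload a UTF-8-only layer would hold
ansi_bytes = text.encode("cp1250")   # payload under a CP1250 system code page

# A layer that assumes the system code page misreads the UTF-8 payload:
# each multi-byte sequence turns into several unrelated characters.
misread = utf8_bytes.decode("cp1250", errors="replace")

# A layer that assumes UTF-8 may not be able to decode the ANSI payload at all.
try:
    ansi_bytes.decode("utf-8")
    ansi_is_valid_utf8 = True
except UnicodeDecodeError:
    ansi_is_valid_utf8 = False
```

Pure ASCII text has the same bytes in both encodings, so the problem
only appears once a string contains non-ASCII characters.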
-Laco.