[fpc-devel] Trying to understand the wiki-Page "FPC Unicode support"

Michael Schnell mschnell at lumino.de
Thu Nov 27 10:03:34 CET 2014


On 11/26/2014 05:25 PM, Sven Barth wrote:
>
>
> > So seemingly you could do MyStringType = type AnsiString(CP_UTF16),
> > and seemingly the size information is set according to this.
>
> No, you can't, because the RTL does not handle that. For AnsiString 
> the element size is *always* 1. It's hardcoded. AFAIK Delphi even does 
> a compile error if you use CP_UTF16.
>
>
Thanks for the clarification.

I now understand that the "element size" field in the string header is 
effectively a dummy: under the hood there are two completely separate 
implementations, one for 1-byte strings and one for 2-byte strings, and 
none for other element sizes.
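
To make this concrete, here is a minimal sketch that just prints what the 
header reports for each built-in string type. It assumes FPC 3.0 or later 
(or current trunk), where the System unit provides StringElementSize and 
StringCodePage; the program and variable names are mine, chosen only for 
illustration. If I understand Sven correctly, the element size should come 
out as 1 for both AnsiString variants and 2 for UnicodeString, whatever the 
code page is.

```pascal
program ElemSizeDemo;
{$mode objfpc}{$H+}

var
  A: AnsiString;     // 1-byte elements, default code page
  U8: UTF8String;    // declared in System as type AnsiString(CP_UTF8)
  W: UnicodeString;  // 2-byte elements (UTF-16)
begin
  A := 'abc';
  U8 := 'abc';
  W := 'abc';

  // Both helpers only read the fields stored in the string header.
  WriteLn('AnsiString:    element size ', StringElementSize(A),
          ', code page ', StringCodePage(A));
  WriteLn('UTF8String:    element size ', StringElementSize(U8),
          ', code page ', StringCodePage(U8));
  WriteLn('UnicodeString: element size ', StringElementSize(W),
          ', code page ', StringCodePage(W));
end.
```

The code page value printed for the plain AnsiString depends on the system 
and compiler settings, so I would not rely on any particular number there.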

This was not at all obvious to me, as the language syntax and the string 
header data structure suggest a more universal paradigm: multiple string 
types, each with its own "element-size" and "code-ID-number" setting, 
handled by a common infrastructure.

The "universal paradigm" would allow for extensions (e.g. UTF-32, 
multiple 16-bit code pages, an additional fully dynamic string type, 
n-byte "un-encoded" string types), as I described in the wiki page.

The "dual mode" concept of course does not provide such extensibility, 
so I will stop thinking about this (and bothering the community) and am 
happy that it just works as it is.

Thanks again,
-Michael


