[fpc-devel] String and UnicodeString and UTF8String
Hans-Peter Diettrich
DrDiettrich1 at aol.com
Tue Jan 11 17:50:16 CET 2011
Jonas Maebe wrote:
>> And we have to deal with Windows, where the default is UTF16.
>
> ... since Delphi 2009 uses (unicode)string everywhere, we need at least also unicode versions.
Since the generic Delphi "string" type can be any Unicode encoding now,
it would IMO be legitimate for FPC to use UTF-8 or UTF-32 for it internally.
Some code that expects UCS-2/BMP text only may become a bit slower, because
indexed access to chars then requires a conversion, but no other
*implicit* conversions will ever occur. Likewise, the generic "char" type
could become a 32-bit type, so that it can hold *every* Unicode code point.
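
To illustrate the cost of indexed access mentioned above, here is a minimal
sketch in Python (not FPC code). The function names are hypothetical and
exist only for this example. It assumes valid UTF-8 input, 0-based indices,
and little-endian UTF-32. Finding the n-th code point in UTF-8 needs a scan
from the start, while in UTF-32 it is a fixed-offset read.

```python
import struct


def _utf8_len(lead: int) -> int:
    """Length in bytes of the UTF-8 sequence that starts with `lead`."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def codepoint_at_utf8(data: bytes, index: int) -> int:
    """Code point at 0-based `index`: walks every preceding sequence."""
    pos = 0
    for _ in range(index):
        pos += _utf8_len(data[pos])
    length = _utf8_len(data[pos])
    return ord(data[pos:pos + length].decode("utf-8"))


def codepoint_at_utf32(data: bytes, index: int) -> int:
    """Code point at 0-based `index`: one read at a fixed offset."""
    return struct.unpack_from("<I", data, 4 * index)[0]
```

Both functions return the same code point for the same logical index; only
the amount of work differs, and the difference grows with the index.
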
For both "string" and "array of char", the "packed" keyword could be used
to distinguish between different byte counts and encodings, where unpacked
types contain UTF-32 chars. Unpacked types would speed up user code with
indexed access, compared to both UTF-8 and UTF-16 encodings, and the choice
would allow the user to optimize his code for either speed or size. Indexed
access to packed types could simply be disallowed, without breaking anything,
since the default is "not packed".
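
The packed/unpacked split could be modelled roughly as follows. This is only
a sketch in Python, not a proposal for actual FPC syntax or RTL code. The
class names are hypothetical, and the choice of UTF-8 for the packed form is
an assumption made for the example (the idea above leaves the packed encoding
open). Indices here are 0-based, unlike Pascal strings.

```python
import struct


class UnpackedString:
    """Hypothetical model: one 32-bit unit per code point, indexable."""

    def __init__(self, text: str) -> None:
        self._data = text.encode("utf-32-le")

    def __len__(self) -> int:
        return len(self._data) // 4

    def __getitem__(self, index: int) -> int:
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Fixed-size units make this a direct offset computation.
        return struct.unpack_from("<I", self._data, 4 * index)[0]

    def size_in_bytes(self) -> int:
        return len(self._data)


class PackedString:
    """Hypothetical model: compact storage (UTF-8 assumed), no indexing."""

    def __init__(self, text: str) -> None:
        self._data = text.encode("utf-8")

    def __getitem__(self, index: int) -> int:
        raise TypeError("indexed access is disallowed on packed strings")

    def size_in_bytes(self) -> int:
        return len(self._data)

    def unpack(self) -> UnpackedString:
        """Explicit conversion for code that needs indexed access."""
        return UnpackedString(self._data.decode("utf-8"))
```

In this model the user picks size (packed) or speed (unpacked) explicitly,
and the only conversion is the visible call to unpack().
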
Just some more ideas...
DoDi