[fpc-devel] String and UnicodeString and UTF8String

Hans-Peter Diettrich DrDiettrich1 at aol.com
Tue Jan 11 17:50:16 CET 2011


Jonas Maebe wrote:

>> And we have to deal with Windows, where the default is UTF16.
> 
> ... since Delphi 2009 uses (unicode)string everywhere, we need at least also unicode versions.

Since the generic Delphi "string" type can be any Unicode encoding now, 
it would IMO be legal to use UTF-8 or UTF-32 for it internally, in FPC. 
Some code that expects UCS-2/BMP text only may become a bit slower, 
because indexed access to chars then requires conversions, but no other 
*implicit* conversions will ever occur. Likewise the generic "char" type 
could become a 32-bit type, so that it can hold *every* Unicode codepoint.
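
To illustrate the indexing cost, here is a minimal sketch in Python 
(not Pascal, and not FPC code). The helper names utf8_index and 
utf32_index are made up for this example. With UTF-8 the n-th codepoint 
has to be found by scanning from the start, while with UTF-32 it sits at 
a fixed byte offset:

```python
def utf8_index(data: bytes, n: int) -> str:
    """Return the n-th codepoint of UTF-8 data by scanning from the start."""
    if n < 0:
        raise IndexError(n)
    count = -1
    start = None
    for i, b in enumerate(data):
        if (b & 0xC0) != 0x80:          # not a continuation byte: new codepoint
            count += 1
            if count == n:
                start = i               # first byte of the wanted codepoint
            elif count == n + 1:
                return data[start:i].decode("utf-8")
    if start is None:
        raise IndexError(n)
    return data[start:].decode("utf-8")  # wanted codepoint was the last one


def utf32_index(data: bytes, n: int) -> str:
    """Return the n-th codepoint of UTF-32-LE data via a fixed offset."""
    if not 0 <= n < len(data) // 4:
        raise IndexError(n)
    return data[4 * n:4 * n + 4].decode("utf-32-le")


# U+1D11E lies outside the BMP, so it does not fit into one 16-bit unit.
text = "a\u00e9\U0001D11Ez"
utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")
utf16_units = len(text.encode("utf-16-le")) // 2

print(utf8_index(utf8, 2) == utf32_index(utf32, 2))
print(len(text), utf16_units)
```

The last line shows that the four codepoints need five UTF-16 code 
units, which is why a 16-bit "char" cannot hold every codepoint.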

For both "string" and "array of char" the "packed" keyword could be used 
to distinguish between different byte counts and encodings, where unpacked 
types contain UTF-32 chars. This would speed up user code with indexed 
access, in contrast to both the UTF-8 and UTF-16 encodings, and it would 
allow users to optimize their code for either speed or size. Indexed access 
to packed types could simply be disallowed, without breaking anything, 
since the default is "not packed".
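
The size side of that trade-off can be sketched as follows, again in 
Python and only as an illustration. The function encoded_sizes is a 
made-up name, and treating UTF-8 or UTF-16 as the "packed" form is an 
assumption of this example, not part of any existing FPC design:

```python
def encoded_sizes(text: str) -> dict:
    """Return the storage size in bytes of text in each encoding (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),       # 1 to 4 bytes per codepoint
        "utf-16": len(text.encode("utf-16-le")),  # 2 or 4 bytes per codepoint
        "utf-32": len(text.encode("utf-32-le")),  # always 4 bytes per codepoint
    }


for sample in ("abc", "a\u00e9\U0001D11Ez"):
    print(repr(sample), encoded_sizes(sample))
```

For plain ASCII text the UTF-32 form is four times as large as the 
UTF-8 form, which is the price for fixed-offset indexing.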

Just some more ideas...

DoDi



