[fpc-devel] cpstrrtl/unicode branch merged to trunk
Michael Schnell
mschnell at lumino.de
Mon Sep 9 14:25:17 CEST 2013
On 09/07/2013 03:00 PM, Sven Barth wrote:
>
> We do NOT want to force UnicodeString upon every target. The world not
> only consists of Windows!
+1 !
Of course a compiler switch to not use the "NewStrings" would be
appropriate.
OTOH, IMHO it should in fact be possible to use the "NewStrings" on Linux
with a default encoding of UTF8.
Thus, a decently Delphi compatible definition of the encoding when
defining Strings (not using the aliases provided) could be:
"($0000)" Default encoding (e.g. UTF16 when compiled for Windows and
UTF8 when compiled for Linux). The OS-centric RTL functions, and in
Lazarus the LCL, would internally avoid many conversions when user code
passes them strings in the default encoding, declared either with
"($0000)" or with the matching explicit code. Identical to "($mmmm)",
with $mmmm being the default encoding of the compile target.
"($nnnn)" Delphi compatible, auto converting
"($FFFF)" Delphi compatible raw byte string, not auto converting
"($FFFF, 1)" Not Delphi compatible: identical to "($FFFF)" (after a ",",
the element size is defined; without a "," the element size is set
according to the character code)
"($FFFF, 2)" Not Delphi compatible: raw Word string, not auto converting
"($FFFF, 4)" Not Delphi compatible: raw DWord string, not auto converting
"($FFFF, 8)" Not Delphi compatible: raw QWord string, not auto converting
"($FFFE)" Not Delphi compatible: dynamically encoded String, auto
converting when necessary.
The codes "($0000)" and "($FFFE)" are never stored within the string
header nor are they known to the Library functions. They only trigger
the appropriate compiler magic. The String headers always contain the
actual encoding type which is fixed for "($0000)"-predefined Strings and
dynamic for "($FFFE)"-predefined strings.
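To make the rules above concrete, here is a rough model of how a compiler might resolve such a declaration into what ends up in the string header. It is only a sketch in Python, not FPC code: the names resolve, StringType and TARGET_DEFAULT are made up for illustration, the per-target defaults are just the "e.g." from above, and accepting an explicit element size only together with $FFFF is an assumption (only those combinations are listed).

```python
from dataclasses import dataclass
from typing import Optional

CP_DEFAULT = 0x0000  # proposed: "default encoding of the compile target"
CP_DYNAMIC = 0xFFFE  # proposed: dynamically encoded string
CP_RAW = 0xFFFF      # raw string, never auto converted
CP_UTF16 = 1200      # Windows code page number for UTF-16LE
CP_UTF8 = 65001      # Windows code page number for UTF-8

# Assumption: per-target defaults taken from the "e.g." in the proposal.
TARGET_DEFAULT = {"windows": CP_UTF16, "linux": CP_UTF8}


@dataclass(frozen=True)
class StringType:
    header_code: Optional[int]  # None: written per string at run time
    elem_size: Optional[int]    # bytes per element; None: follows run-time encoding
    auto_convert: bool


def resolve(code, target, elem_size=None):
    """Map a declared (code[, elem_size]) pair to the resulting string type."""
    if elem_size is not None:
        # Assumption: an explicit element size is only accepted with $FFFF.
        if code != CP_RAW or elem_size not in (1, 2, 4, 8):
            raise ValueError("unsupported element size declaration")
    if code == CP_RAW:
        # ($FFFF) and ($FFFF, 1) are the same type.
        return StringType(CP_RAW, elem_size or 1, False)
    if code == CP_DYNAMIC:
        # $FFFE itself is never stored; each string carries its actual encoding.
        return StringType(None, None, True)
    if code == CP_DEFAULT:
        # $0000 itself is never stored; it is replaced at compile time.
        code = TARGET_DEFAULT[target]
    return StringType(code, 2 if code == CP_UTF16 else 1, True)
```

With this model, resolve(0x0000, "linux") and resolve(65001, "linux") give the same type, which is the point of "($0000)": user code and the OS-centric RTL functions agree on the encoding without either side having to name it.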
-Michael