[fpc-devel] String and UnicodeString and UTF8String
Hans-Peter Diettrich
DrDiettrich1 at aol.com
Tue Jan 11 17:10:32 CET 2011
Marco van de Voort wrote:
> Btw, while looking up rawbytestring I saw this in the Delphi help:
>
> "Declaring variables or fields of type RawByteString should rarely, if ever,
> be done, because this practice can lead to undefined behavior and potential
> data loss."
IIRC RawByteString should be used like OpenString, i.e. only as a
subroutine argument type. Contrary to its name, a RawByteString has a
variable encoding, so implicit conversions are inserted wherever it is
used together with other string types. AnyByteString would therefore
have been a better name for that type, IMO. Delphi no longer
(officially) supports non-textual data in strings; TBytes should be
used for such data.
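To make the "variable encoding" point concrete, here is a rough model in
Python, not Delphi code. TaggedBytes and convert are hypothetical names
made up for this sketch; they only imitate a byte string that carries its
own code page and gets re-encoded when it meets a different one.

```python
from dataclasses import dataclass


@dataclass
class TaggedBytes:
    """Hypothetical model: a byte string that carries its own code page."""
    data: bytes
    codepage: str  # a Python codec name, e.g. "cp1252" or "utf-8"


def convert(s: TaggedBytes, target: str) -> TaggedBytes:
    """Re-encode s for the target code page, as an implicit conversion would."""
    if s.codepage == target:
        return s  # same encoding: the bytes pass through untouched
    text = s.data.decode(s.codepage)  # interpret the bytes using their own tag
    return TaggedBytes(text.encode(target), target)


ansi = TaggedBytes("é".encode("cp1252"), "cp1252")  # one byte: E9
utf8 = convert(ansi, "utf-8")                       # two bytes: C3 A9
```

The bytes change on conversion even though the text does not, which is
why such a type is a poor container for non-textual data.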
> How will you deal with e.g. Windows? Legacy string=ansistring(0), D2009 is
> string=utf16 TUnicodestring?
Is a Delphi UnicodeString really compatible with a WinAPI
WideString/BSTR? AFAIR all BSTRs must reside in shared memory, so that
copies are required for every API call.
> Mainly the question what the classtree will be. The main operating type used
> in applications. You always need two RTLs for that, since it can be 1 or 2
> byte, and even if you fixated it on one byte encodings, rawbytestring would
> force you to write case statements in each and every procedure.
UTF-8 combines a single (byte-based) storage type with lossless
encoding of the full Unicode range. Ansi and UCS2 (really UTF-16) only
*look* easier to handle in user code, but both will fail and require
special code whenever characters outside the assumed codepage (or, for
UCS2, outside the BMP) may occur.
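A small Python sketch of that claim, using Python's standard codecs
rather than any Pascal RTL. The sample text and the helper name
fits_codepage are assumptions chosen for illustration only.

```python
# 'A', the euro sign, and the musical G clef (a character outside the BMP)
text = "A\u20ac\U0001D11E"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

# UTF-8 stores every Unicode character in one byte-based type, losslessly.
roundtrip_ok = utf8.decode("utf-8") == text

# UTF-16 needs two 16-bit code units (a surrogate pair) for the clef, so
# "one element per character" no longer holds: 3 characters, 4 code units.
code_units = len(utf16) // 2


def fits_codepage(s: str, codepage: str) -> bool:
    """Return True if every character of s exists in the given codepage."""
    try:
        s.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


euro_in_cp1252 = fits_codepage("A\u20ac", "cp1252")    # cp1252 has the euro sign
euro_in_latin1 = fits_codepage("A\u20ac", "latin-1")   # ISO 8859-1 does not
clef_in_cp1252 = fits_codepage(text, "cp1252")         # no Ansi codepage has the clef
```

Both the Ansi case and the 16-bit case work until the first character
that falls outside the assumed range shows up.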
DoDi