[fpc-devel] UPD2 cpstring proposals

Alex Shishkin alexvins at mail.ru
Thu Oct 13 11:41:47 CEST 2011


12.10.2011 15:29, Alex Shishkin wrote:
>
> My proposed changes to cpstring.
> 1) If a string is declared without an explicit encoding (e.g. just
> "string" under the H+ modeswitch, or "ansistring"), it is treated as
> RawByteString.
> 2) In Unicode Delphi mode, the encoding of all string constant values is
> forced to UTF-16, whatever the source encoding is. String variables are
> also forced to UTF-16.
> But most Unicode Delphi code could be compiled in simple Delphi mode.
> 3) All RTL string routines should be encoding-aware (accept
> RawByteString). There is no need for
> separate Unicode versions.
>
> 4) UTF8String, RawByteString and UnicodeString are aliases, not unique
> types.
> 5) Concatenation of two RawByteStrings converts the right operand to the
> left operand's encoding.
> 6) Maybe use the concept of a "universal string" from my previous message:
> "ansistring" (without an explicit encoding) = RawByteString + clause 5.
>
> In short, the main idea is to use RawByteStrings as widely as possible
> while avoiding data corruption (performing codepage conversion only when
> it is absolutely necessary).
>

7) String indexing
If a string is "universal", indexing is always byte-based (compatible with
Delphi XE AnsiString and legacy ansistring), so s[i] is always the i-th byte.
For a UTF-16 string, indexing is word-based, of course. Indexing of
UTF8String is an open question (the i-th byte, or the i-th Unicode
character, i.e. a Cardinal-sized value).
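
To make the three indexing options concrete, here is a small sketch. It is written in Python only because that is easy to run; it is not FPC code and not how the RTL would do it. The helper names (byte_at, utf16_unit_at, code_point_at) are made up for this illustration, and indices are 1-based as in Pascal.

```python
import struct

# Hypothetical helpers, for illustration only (not part of any RTL).

def byte_at(s: str, i: int, encoding: str) -> int:
    """1-based byte index into the encoded string ("universal" string style)."""
    return s.encode(encoding)[i - 1]

def utf16_unit_at(s: str, i: int) -> int:
    """1-based index of a 16-bit code unit (word-based, as for a UTF-16 string)."""
    data = s.encode("utf-16-le")
    return struct.unpack_from("<H", data, 2 * (i - 1))[0]

def code_point_at(s: str, i: int) -> int:
    """1-based index of a whole code point (a value that needs up to 32 bits)."""
    return ord(s[i - 1])

# Sample text: U+0061 (1 byte in UTF-8), U+044F (2 bytes), U+1D11E (4 bytes).
sample = "a\u044f\U0001d11e"

print(hex(byte_at(sample, 2, "utf-8")))   # 0xd1, first byte of U+044F
print(hex(utf16_unit_at(sample, 3)))      # 0xd834, high surrogate of U+1D11E
print(hex(code_point_at(sample, 3)))      # 0x1d11e, the whole character
```

The same index gives three different answers depending on the unit. Byte and word indexing are O(1) but can land in the middle of a character; character indexing on UTF-8 data has to scan from the start of the string.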




