[fpc-devel] simple UTF tests
Marco van de Voort
marcov at stack.nl
Mon Jan 9 11:09:37 CET 2012
In our previous episode, Michael Schnell said:
> > An ansistring is always 8-bit.
> Sorry I can't follow here.
> Of course the term "ANSI" suggests 8 bit, but it also suggests one
> visible character = 8 bit, thus non UTF.
No, it means that the encoding granularity is 8-bit. Length returns the number
of encoding units, not the number of codepoints (always a 32-bit value, encoded
in sequences of 8-bit (ansistring) or 16-bit (widestring, unicodestring) units)
or of printable characters (each possibly multiple codepoints).
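
To make the three counts concrete, here is a minimal sketch. It is written in
Python rather than Pascal purely for illustration, and it does not call any FPC
or Delphi routine. The sample string and the helper name code_units are my own
assumptions. The 8-bit count corresponds to what Length would report for a
UTF-8 encoded ansistring, the 16-bit count to a unicodestring holding the same
text.

```python
# Illustration only: the sample text and helper name are hypothetical,
# chosen to show that the three counts differ.

def code_units(text, encoding, unit_bytes):
    """Number of encoding units of the given width needed to store text."""
    return len(text.encode(encoding)) // unit_bytes

# 'e' + COMBINING ACUTE ACCENT (U+0301) + MUSICAL SYMBOL G CLEF (U+1D11E).
# A reader sees two printable characters: an accented e and a clef.
sample = "e\u0301\U0001D11E"

utf8_units = code_units(sample, "utf-8", 1)       # 8-bit units
utf16_units = code_units(sample, "utf-16-le", 2)  # 16-bit units, no BOM
codepoints = len(sample)                          # Python counts codepoints

print(utf8_units)   # 7  (1 + 2 + 4 bytes)
print(utf16_units)  # 4  (1 + 1 + a surrogate pair)
print(codepoints)   # 3
```

None of the three numbers equals the two printable characters a reader would
count, which is the point: the unit width fixes what Length counts, not what
the string is allowed to hold.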
> If a type called "ANSI..." is used to hold UTF codes, the term ANSI is
> abused anyway and now the handling of the type can be defined in any way
> that seems appropriate,
Whatever the name is, in all current Unicode Delphi versions and FPC
ansistring means 8-bit string exclusively.