[fpc-devel] simple UTF tests

Marco van de Voort marcov at stack.nl
Mon Jan 9 11:09:37 CET 2012


In our previous episode, Michael Schnell said:
> > An ansistring is always 8-bit. 
> Sorry I can't follow here.
> 
> Of course the term "ANSI" suggests 8 bit, but it also suggest one 
> visible character = 8 bit, thus non UTF.

No, it means that the encoding granularity is 8-bit. Length returns the number
of encoding units, not codepoints (always 32-bit, encoded in sequences of 8-bit
(ansistring) or 16-bit (widestring, unicodestring) units) or printable
characters (possibly multiple codepoints each).
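
To make the three ways of counting concrete, here is a small sketch. It is
written in Python rather than Pascal purely for illustration, and the sample
string is my own choice; the UTF-8 byte count is what Length would be expected
to report for an 8-bit string holding UTF-8 data, and the UTF-16 count
corresponds to a 16-bit string type.

```python
# Illustrative sketch only: Python stands in for Pascal here.
# Sample string (an assumption, not from the thread):
# 'e' + COMBINING ACUTE ACCENT + one emoji outside the BMP.
text = "e\u0301\U0001F600"

# Number of 8-bit encoding units (bytes) in UTF-8.
utf8_units = len(text.encode("utf-8"))

# Number of 16-bit encoding units in UTF-16 (two bytes per unit, no BOM).
utf16_units = len(text.encode("utf-16-le")) // 2

# Python's str is indexed by codepoint, so len() counts codepoints.
codepoints = len(text)

print(utf8_units, utf16_units, codepoints)  # prints: 7 4 3
```

A reader sees only two printable characters here (an accented e and the
emoji), so none of the three counts matches what is visible on screen.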
 
> If a type called "ANSI..." is used to hold UTF codes, the term ANSI is 
> abused anyway and now the handling of the type can be defined in any way 
> that seems appropriate,

Whatever the name is, in all current Unicode Delphi versions and FPC
ansistring means 8-bit string exclusively.



