[fpc-pascal] UnicodeString and Length() function
Jonas Maebe
jonas.maebe at elis.ugent.be
Fri Mar 25 20:26:59 CET 2016
On 25/03/16 20:21, Graeme Geldenhuys wrote:
> Length() returns the number of bytes, correct?
It returns the number of ansichars or widechars, i.e. code units rather
than user-perceived characters. In the case of ansichars, that equals
the number of bytes.
> So why isn't the result 8 and 14? The letter o with acute is 2-bytes in
> UTF8 ($C3 & $B4).
That depends on whether the character is composed or decomposed (this
depends on your text editor and its settings, not on the compiler). If
it's decomposed, then you get (an) extra character(s) for the decomposed
"´" following the "o" (2 bytes in utf-8, 1 widechar in utf-16).
Jonas
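
To make the composed/decomposed difference concrete, here is a minimal
sketch in Python (chosen only because it ships with Unicode normalization
in its standard library; it is not FPC code). The helper names utf8_len
and utf16_len are hypothetical. They count UTF-8 bytes and UTF-16 code
units, which is what Length() would be expected to report for a
UTF-8-encoded ansistring and for a UnicodeString respectively, per the
explanation above. The decomposed form uses the combining acute accent
U+0301.

```python
import unicodedata


def utf8_len(s: str) -> int:
    """Number of bytes in the UTF-8 encoding (ansistring view)."""
    return len(s.encode("utf-8"))


def utf16_len(s: str) -> int:
    """Number of 16-bit code units in the UTF-16 encoding (widechar view)."""
    return len(s.encode("utf-16-le")) // 2


# Composed: one code point, U+00F3 (o with acute).
composed = unicodedata.normalize("NFC", "o\u0301")
# Decomposed: "o" followed by U+0301 (combining acute accent).
decomposed = unicodedata.normalize("NFD", "\u00f3")

for name, s in (("composed", composed), ("decomposed", decomposed)):
    print(name, utf8_len(s), utf16_len(s))
```

This prints "composed 2 1" and "decomposed 3 2": the decomposed form
costs 2 extra bytes in UTF-8 and 1 extra widechar in UTF-16, even though
both forms look identical on screen.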