[fpc-pascal] UnicodeString and Length() function

Jonas Maebe jonas.maebe at elis.ugent.be
Fri Mar 25 20:26:59 CET 2016


On 25/03/16 20:21, Graeme Geldenhuys wrote:
> Length() returns the number of bytes, correct?

It returns the number of ansichars or widechars (i.e. code units, 
depending on the string type). In case of ansichars, that equals the 
number of bytes.

> So why isn't the result 8 and 14?  The letter o with acute is 2-bytes in
> UTF8 ($C3 & $B4).

That depends on whether the character is composed or decomposed (this 
depends on your text editor and its settings, not on the compiler). If 
it's decomposed, then you get (an) extra character(s) for the decomposed 
"´" following the "o" (2 bytes in utf-8, 1 widechar in utf-16).
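
As a rough illustration of the composed/decomposed difference, here is a 
small sketch that counts UTF-8 bytes and UTF-16 code units for both forms 
of an "o" with acute accent. It is written in Python rather than Pascal 
so that the characters can be given as escapes and the result does not 
depend on editor or source-file encoding. The helper name unit_counts is 
made up for this example, and it assumes the decomposed form is the 
canonical NFD one ("o" followed by the combining acute accent U+0301).

```python
import unicodedata


def unit_counts(text):
    """Return (UTF-8 byte count, UTF-16 code unit count) for text."""
    utf8_bytes = len(text.encode("utf-8"))
    # Every UTF-16 code unit (one widechar) occupies two bytes.
    utf16_units = len(text.encode("utf-16-le")) // 2
    return utf8_bytes, utf16_units


# Precomposed form: the single code point U+00F3.
composed = "\u00f3"
# Decomposed form (assumed NFD): "o" followed by U+0301.
decomposed = unicodedata.normalize("NFD", composed)

print(unit_counts(composed))    # (2, 1)
print(unit_counts(decomposed))  # (3, 2)
```

The first number in each pair corresponds to what Length() would report 
for a UTF-8 encoded ansistring, the second to what it would report for a 
UTF-16 unicodestring, assuming the string holds exactly these code points.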


Jonas



