[fpc-devel] simple UTF tests

Michael Schnell mschnell at lumino.de
Thu Jan 5 12:33:25 CET 2012


With Lazarus on Linux, I did some simple tests with UTF strings.

I found that the length of an "AnsiString(CP_UTF16)" is reported in 
bytes and not in 16-bit words. Is this the intended behavior?
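
To make the bytes-versus-words distinction concrete, here is a minimal 
sketch in Python. It only models the byte layout of UTF-16 text and says 
nothing about how FPC implements its string types. The helper name 
utf16_lengths is hypothetical, and little-endian UTF-16 without a BOM is 
an assumption.

```python
# Illustration only: Python models the byte layout here, not FPC's string types.
# Assumption: UTF-16 little-endian, no byte order mark.

def utf16_lengths(text):
    """Return (byte_count, code_unit_count) for text encoded as UTF-16LE."""
    data = text.encode("utf-16-le")
    # Each UTF-16 code unit occupies exactly two bytes.
    return len(data), len(data) // 2


print(utf16_lengths("Hello"))  # (10, 5)
```

For five ASCII characters the byte count is 10 and the 16-bit word count 
is 5, which is the factor of two described above.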


I found that pchar(s8) with a UTF-8 string works as expected, giving a 
pointer to the UTF-8 encoded byte array.
Still: can one rely on the encoding of a pchar being UTF-8? Is this 
portable?


p16 := pchar(s16) with a UTF-16 string gives a pointer to the first byte 
of the word array, so (with ASCII text) the second byte is zero, which 
yields a C string of length 1. Is this the intended behavior?
Of course, assigning p16 back to a UTF-16 string does not reproduce the 
original string.
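
The effect can be reproduced outside of FPC. The following Python sketch 
scans a byte buffer the way C's strlen() does, stopping at the first zero 
byte. The helper name c_strlen is hypothetical, and the buffers assume 
little-endian UTF-16 and ASCII-only text.

```python
# Illustration only: emulates a byte-wise, zero-terminated (C string) scan.
# Assumption: UTF-16 little-endian, ASCII-only text.

def c_strlen(buf):
    """Count the bytes before the first zero byte, as C's strlen() would."""
    end = buf.find(b"\x00")
    return len(buf) if end < 0 else end


utf16_buf = "Hello".encode("utf-16-le") + b"\x00\x00"  # 16-bit terminator
utf8_buf = "Hello".encode("utf-8") + b"\x00"           # 8-bit terminator

print(c_strlen(utf16_buf))  # 1
print(c_strlen(utf8_buf))   # 5
```

The high byte of the first ASCII character is already zero, so a 
byte-wise scan of the UTF-16 buffer stops after one byte, while the 
UTF-8 buffer is read in full.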

What encoding should be assumed for a pchar?



The debugger does not show UTF-16 strings correctly (it shows the same 
result as pchar()). Is this just a Lazarus problem, or does FPC need to 
provide additional support for this?

-Michael


