[fpc-devel] Unicode support (again)
mschnell at lumino.de
Mon Nov 10 16:48:38 CET 2008
I found that the current FPC does have Unicode support, but there are
some issues:
- WideStrings work fine with Unicode UCS-2, but they (of course) have
issues similar to those of UTF8 strings when surrogate codes are used
(which is rarely necessary in Europe and America).
- FPC does not have a dedicated type "UTF8String", but the type defined
as "UTF8String" is just the same as ANSIString and thus the compiler
can't decide which is meant by the programmer and can't create the
appropriate code when it's necessary to distinguish between them (e.g.
when it should automatically convert between locale-coded ANSIString,
UTF8String and WideString).
- by design (for speed's sake), UTF8String (and WideString when surrogate
codes are used) count in code units (bytes or 16-bit words) and not in
Unicode characters, so the behavior is "unexpected" when doing things
like s[i], pos(s), copy(), delete(), ... There are no _slow_ functions
that do the "expected" versions of s[i], pos(s), copy(), delete(), ...
(I've yet to find out how I can print just the first character of a
UTF8String :)
- there is no decent "character" type for UTF8 or UTF16 coded
Characters (WideChar (UCS2 code) works if no surrogate codes are used.)
- there are different options for how the compiler interprets the
encoding of the source file. Seemingly, if it detects the file to be
UTF8 coded and a certain (otherwise correct) option is set, even
"s := 'hallo äöü';" does not work as expected if s is a WideString.
(Lazarus with default settings suffers from this problem.)
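
To make the code-unit point above concrete, here is a minimal sketch of the "slow but expected" behavior for UTF8 data held in an AnsiString. Utf8SeqLen, Utf8FirstChar and Utf8Length are hypothetical helper names made up for this example, not RTL functions. The sketch only inspects the lead byte of each sequence and does not validate continuation bytes, so treat it as an illustration rather than a robust implementation.

```pascal
program FirstUtf8Char;

{$mode objfpc}{$H+}

{ Hypothetical helper (not part of the RTL): number of bytes in the UTF-8
  sequence starting at byte index i of s, judged by the lead byte only.
  Returns 1 for continuation or invalid lead bytes so callers always advance. }
function Utf8SeqLen(const s: AnsiString; i: Integer): Integer;
var
  b: Byte;
begin
  b := Ord(s[i]);
  if b < $80 then
    Result := 1                      { 0xxxxxxx: ASCII }
  else if (b and $E0) = $C0 then
    Result := 2                      { 110xxxxx }
  else if (b and $F0) = $E0 then
    Result := 3                      { 1110xxxx }
  else if (b and $F8) = $F0 then
    Result := 4                      { 11110xxx }
  else
    Result := 1;                     { stray continuation or invalid byte }
  { Never report more bytes than the string actually has left. }
  if i + Result - 1 > Length(s) then
    Result := Length(s) - i + 1;
end;

{ Hypothetical helper: the bytes that make up the first character of s. }
function Utf8FirstChar(const s: AnsiString): AnsiString;
begin
  if s = '' then
    Result := ''
  else
    Result := Copy(s, 1, Utf8SeqLen(s, 1));
end;

{ Hypothetical helper: counts characters (code points) instead of bytes. }
function Utf8Length(const s: AnsiString): Integer;
var
  i: Integer;
begin
  Result := 0;
  i := 1;
  while i <= Length(s) do
  begin
    Inc(i, Utf8SeqLen(s, i));
    Inc(Result);
  end;
end;

var
  s: AnsiString;
begin
  { "äöü" spelled as explicit UTF-8 bytes, so the result does not depend
    on how the compiler interprets the source file encoding. }
  s := #$C3#$A4#$C3#$B6#$C3#$BC;
  WriteLn(Length(s));                  { 6: Length counts bytes }
  WriteLn(Utf8Length(s));              { 3: characters }
  WriteLn(Length(Utf8FirstChar(s)));   { 2: "ä" occupies two bytes }
end.
```

The program prints only lengths on purpose: whether the bytes of "ä" show up correctly on the console depends on the terminal's encoding, which is a separate matter. The same lead-unit approach would apply to a WideString with surrogates, where a word in the range $D800..$DBFF announces a two-word character.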