[fpc-devel] Unicode support (again)

Michael Schnell mschnell at lumino.de
Tue Nov 11 13:15:57 CET 2008


> Because e.g. on the ext3 file system, you can have two files with the 
> name "ü" in the same directory. One named using the single character 
> "ü" and one named using the string "u¨" (both in UTF-8). If you 
> make the compiler automatically normalise everything, you lose 
> information (and get the security holes etc).
I see, but since good old ANSIStrings do not handle this decently 
either, there is no "friendly old-school" way that a compiler would be 
able to offer. In these special cases, the user of course needs to 
handle the upgrade of his project to Unicode explicitly.

OTOH, in this special case, I don't see why the compiler should 
"normalize" "u¨" to "ü". If the software is supposed to handle 
Unicode, the Unicode string "u¨" should be considered perfectly legal 
two-code-point data consisting of a "u" (a single code unit, i.e. one 
byte, in UTF-8) and a double-dot (presumably two bytes in UTF-8). If 
the user wants to handle this as a single "ü", he should write 
appropriate code for that. Any automatic handling of this is dangerous.
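
To make the byte counts above concrete, here is a minimal sketch. It is 
written in Python rather than Free Pascal, purely as an illustration, 
and it assumes the "double-dot" is U+0308 COMBINING DIAERESIS. The 
helper name describe is made up for this example.

```python
import unicodedata

# Assumption: the decomposed form is "u" followed by U+0308
# (COMBINING DIAERESIS); the precomposed form is U+00FC.
precomposed = "\u00fc"
decomposed = "u\u0308"


def describe(s):
    """Return (number of code points, number of UTF-8 bytes)."""
    return len(s), len(s.encode("utf-8"))


print(describe(precomposed))  # (1, 2)
print(describe(decomposed))   # (2, 3): 1 byte for "u", 2 for U+0308

# Without normalisation the two strings are different, which is why
# both can exist as file names in the same directory.
print(precomposed == decomposed)  # False

# Normalisation has to be requested explicitly by the program.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```

The comparison only succeeds after the explicit normalize call, so 
the decision to treat both spellings as the same "ü" stays with the 
program and is not made silently.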

-Michael


