[fpc-devel] FPC 2.3.1 seems a mixed mess with Unicode support

Michael Schnell mschnell at lumino.de
Wed Sep 16 09:00:24 CEST 2009


Marco van de Voort wrote:
> In our previous episode, Michael Schnell said:
>> If we really want a "character", MyChar would need to be a 32-bit thing,
>> and (in the case of UTF) the [n] notation would need to scan the Unicode
>> byte stream to find it, but I don't know if it's implemented that way.
> 
> Afaik a character in the Unicode sense can consist of multiple
> codepoints (e.g. for languages that have many possibilities of combining
> "accents", where there doesn't exist a glyph for every combination).
> 
> So a character (as something that prints as a whole) can consist of
> multiple 32-bit values (codepoints).

Even worse!

So the term "Unicode character" does not make sense at all.
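
To make this concrete, here is a small sketch in Python (chosen only because
its standard library ships the Unicode tables; this is not FPC code, and the
helper name describe is made up for the illustration). It shows one visible
"character" stored either as two code points or as one, with different UTF-8
byte lengths in each case.

```python
import unicodedata

# "é" written as a base letter plus a combining accent (two code points).
decomposed = "e\u0301"
# The same visible character as a single precomposed code point.
precomposed = "\u00e9"


def describe(s):
    """Return (code point count, UTF-8 byte count, code point names)."""
    names = [unicodedata.name(ch) for ch in s]
    return len(s), len(s.encode("utf-8")), names


for label, text in (("decomposed", decomposed), ("precomposed", precomposed)):
    count, size, names = describe(text)
    print(label, count, "code point(s),", size, "UTF-8 byte(s):", names)
```

Neither the byte count nor the code point count matches what a reader would
call "one character" in both cases.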

I suppose converting a combined character into a single character is not
possible in general, as it would need a huge table.

But if this conversion is possible in theory (even if not in all cases)
and not in practice, this means that there is _no_ way to determine
whether two Unicode strings are identical.
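
For what it's worth, the Unicode standard defines normalization forms (NFC,
NFD and others) that address this comparison problem. The following is a
minimal sketch in Python, assuming that "identical" means canonically
equivalent; same_text is a hypothetical helper name, and this says nothing
about what FPC itself provides.

```python
import unicodedata


def same_text(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


decomposed = "e\u0301"   # "e" + COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE

# A plain code point comparison treats the two spellings as different.
print(decomposed == precomposed)
# After normalization they compare equal.
print(same_text(decomposed, precomposed))

# "q" + acute has no precomposed code point, so NFC keeps two code points.
no_single = unicodedata.normalize("NFC", "q\u0301")
print(len(no_single))

# Normalization also fixes the order of stacked combining marks, so these
# two spellings of "q" with dot below and acute still compare equal.
print(same_text("q\u0301\u0323", "q\u0323\u0301"))
```

So a single precomposed character does not exist for every combination, but
comparison only needs both strings brought into the same normal form, not a
one-code-point result.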

This makes programming a profoundly obscene adventure, and we had better
start breeding cattle instead.

Obviously, combined Unicode characters are code from hell and should be
banned completely :(

-Michael



