[fpc-devel] FPC 2.3.1 seems a mixed mess with Unicode support

Marco van de Voort marcov at stack.nl
Thu Sep 17 10:22:12 CEST 2009


In our previous episode, Michael Schnell said:
> Jonas Maebe wrote:
> > 
> > Neither that much space nor that much time is required.
> 
> Any pointers regarding a decent estimation ?

http://www.stack/nl/~marcov/unicode.jpg

There is a v5 now, though.
 
> As there are billions of possible Unicode "characters" and most of them
> potentially can be alternately depicted by one or multiple
> multi-Unicode surrogates, I don't share your optimism.

It's not that much. For multi-codepoint chars that cannot be reduced to a
single codepoint, they probably just order the various codepoints in some
fixed way in canonical form, drastically reducing the number of combinations.
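
As a rough illustration of such a fixed ordering (a sketch in Python using
the standard unicodedata module, not anything FPC does; the names a, b and
canonical are made up for the example): q with a dot above and a dot below
has no single-codepoint form, yet both input orders end up as the same
canonical sequence.

```python
import unicodedata

# Two spellings of the same visible character: q + dot above (U+0307)
# + dot below (U+0323). No precomposed codepoint exists for it.
a = "q\u0307\u0323"  # dot above typed first
b = "q\u0323\u0307"  # dot below typed first


def canonical(s: str) -> str:
    """Return the NFD form; combining marks get sorted by combining class."""
    return unicodedata.normalize("NFD", s)


# The raw strings differ, the canonical forms do not.
same_raw = a == b
same_canonical = canonical(a) == canonical(b)

if __name__ == "__main__":
    print("raw equal:      ", same_raw)
    print("canonical equal:", same_canonical)
    print("canonical order:", [f"U+{ord(c):04X}" for c in canonical(a)])
```

So the number of orderings that have to be considered collapses to one per
set of combining marks, at least for marks with different combining classes.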

However, that still doesn't solve equivalent chars. That really needs
language-dependent tables. The charset and the language-dependent
interpretation of it are two different things.
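
To make the distinction concrete, a small sketch in Python. The EQUIV tables
and the fold and equivalent helpers are hypothetical and only hold a few
entries for illustration (German phonebook-style folding versus Swedish,
where the umlauted letters are letters in their own right); real tables would
have to come from proper locale data.

```python
# Hypothetical per-language equivalence tables, for illustration only.
EQUIV = {
    # German phonebook-style: umlauts compare like vowel + e
    "de": {"\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue", "\u00df": "ss"},
    # Swedish: the same codepoints are separate letters, nothing is folded
    "sv": {},
}


def fold(text: str, lang: str) -> str:
    """Lowercase text and apply the equivalence table of the given language."""
    table = EQUIV.get(lang, {})
    return "".join(table.get(ch, ch) for ch in text.lower())


def equivalent(x: str, y: str, lang: str) -> bool:
    """True if x and y compare equal under the language's table."""
    return fold(x, lang) == fold(y, lang)


if __name__ == "__main__":
    x, y = "M\u00fcller", "Mueller"  # identical codepoints in both runs
    print("de:", equivalent(x, y, "de"))
    print("sv:", equivalent(x, y, "sv"))
```

The strings and their encoding are the same in both calls; only the language
table differs, and with it the answer.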


