[fpc-devel] utf8 reading

Uberto Barbini uberto at ubiland.net
Thu Mar 10 19:29:40 CET 2005


> UCS-2 or UTF-16 how it called by the unicode consortium is "escaped" as
> well and you've to take care of it in your code. 

Mmh, no. 
UCS-2 is different from UTF-16: UTF-16 is escaped, UCS-2 is not. The price 
is that UCS-2 cannot represent all Unicode characters (see the case of Vogon 
poetry).

See:
http://www.uazone.com/multiling/unicode/ucs2.html
http://lists.samba.org/archive/jcifs/2002-July/000969.html
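
To make the UCS-2 / UTF-16 difference concrete, here is a small Python sketch
of how UTF-16 "escapes" a character that does not fit in 16 bits. The function
name utf16_units is only illustrative, and the sketch assumes it is given a
valid code point (it does not reject the surrogate range itself). UCS-2 has no
such mechanism, so anything above U+FFFF simply cannot be written in it.

```python
def utf16_units(cp):
    """Return the 16-bit code units UTF-16 uses for code point cp.

    Illustrative helper: assumes 0 <= cp <= 0x10FFFF and that cp is not
    itself in the surrogate range D800..DFFF.
    """
    if cp < 0x10000:
        # Fits in one 16-bit unit: identical to the UCS-2 representation.
        return [cp]
    # Outside the 16-bit range: split the 20-bit offset into two halves
    # and place them in the reserved surrogate ranges.
    v = cp - 0x10000
    high = 0xD800 + (v >> 10)      # top 10 bits
    low = 0xDC00 + (v & 0x3FF)     # bottom 10 bits
    return [high, low]


# 'A' needs one unit; U+1D11E (musical G clef) needs a surrogate pair.
print([hex(u) for u in utf16_units(0x41)])
print([hex(u) for u in utf16_units(0x1D11E)])
```

Code that indexes a UTF-16 string by 16-bit unit has to watch for these pairs,
which is the care the quoted message refers to.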

> Even in utf-32 you've 
> to take care of surrogate pairs.

In utf-32 yes, in UCS-4 no.
Theoretically we could have a UCS-8 in the future, but for the next hundred 
years this is not very likely.

> > Using natively utf-8 I think is impossible, because the encoding.
>
> Why?

Because in UTF-8 a character takes a variable number of bytes, every simple 
function on strings (like copy) would have to start reading the string from 
the beginning just to find where the Nth character is.
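
A rough sketch of what I mean, in Python for brevity. The names
utf8_char_offset and utf8_copy are made up for this example, the index is
0-based (unlike Pascal's Copy), and the input is assumed to be well-formed
UTF-8. The point is only that locating a character by index means walking the
bytes from the start, whereas with a fixed-width encoding it is a
multiplication.

```python
def utf8_char_offset(data, index):
    """Byte offset of the character with 0-based index `index`.

    Returns len(data) if the string has fewer characters than that.
    Assumes `data` is well-formed UTF-8.
    """
    seen = 0
    for pos, byte in enumerate(data):
        # Continuation bytes look like 10xxxxxx; anything else starts a
        # new character.
        if (byte & 0xC0) != 0x80:
            if seen == index:
                return pos
            seen += 1
    return len(data)


def utf8_copy(data, start, count):
    """Copy `count` characters starting at character `start` (0-based)."""
    begin = utf8_char_offset(data, start)               # linear scan
    end = begin + utf8_char_offset(data[begin:], count)  # another scan
    return data[begin:end]


s = "a\u00e9\u20acb".encode("utf-8")   # 1 + 2 + 3 + 1 = 7 bytes
print(utf8_copy(s, 1, 2).decode("utf-8"))
```

Both scans are proportional to the position in the string, so a copy near the
end of a long string costs far more than one near the beginning.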

Bye Uberto



