[fpc-devel] Unicode support (yet again)
Graeme Geldenhuys
graemeg.lists at gmail.com
Fri Sep 16 11:31:43 CEST 2011
On 16/09/2011 00:01, Dimitri Smits wrote:
>
> errrm, utf-8 can have 6 octets representing one character,
Last time I checked, that was only true in the very early stages of
developing the UTF-8 specification. Since then, the maximum size of an
encoded code point in UTF-8 has been 4 bytes.
If you know otherwise, please post a URL. Here is the information I have:
"The original specification allowed for sequences of up to six bytes,
covering numbers up to 31 bits (the original limit of the Universal
Character Set). In November 2003 UTF-8 was restricted by RFC 3629 to
four bytes covering only the range U+0000 to U+10FFFF, in order to match
the constraints of the UTF-16 character encoding."
http://en.wikipedia.org/wiki/UTF-8#History
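To make the four-byte limit concrete, here is a minimal sketch in Python (not fpGUI code) that maps a code point to its encoded length under RFC 3629. The function name utf8_length is hypothetical, chosen only for this illustration, and the result is cross-checked against Python's built-in encoder.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes RFC 3629 UTF-8 needs for one code point.

    Hypothetical helper for illustration only; not part of fpGUI or FPC.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the range U+0000 to U+10FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    if code_point < 0x80:
        return 1  # U+0000..U+007F (ASCII)
    if code_point < 0x800:
        return 2  # U+0080..U+07FF
    if code_point < 0x10000:
        return 3  # U+0800..U+FFFF
    return 4      # U+10000..U+10FFFF, the largest case


# Cross-check a few sample code points against the built-in encoder.
for cp in (0x41, 0xE9, 0x20AC, 0x1F600, 0x10FFFF):
    assert utf8_length(cp) == len(chr(cp).encode("utf-8"))
```

No valid code point takes more than four bytes. The five- and six-byte forms only existed for values above U+10FFFF, which are no longer allowed.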
> not forgetting those diacritics that are separate characters.
I'm representing a code point in TfpgChar. If you want the "complete
character as it is displayed on the screen", then simply normalize your
TfpgString first, then extract the "character".
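As a rough illustration of the normalize-then-extract idea, the sketch below uses Python's standard unicodedata module rather than TfpgString, so it only shows the principle. It assumes NFC is the normalization form meant here, which the text above does not specify.

```python
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT: two code points, one glyph.
decomposed = "e\u0301"
assert len(decomposed) == 2

# NFC composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)
assert composed == "\u00e9"
assert len(composed) == 1

# Caveat: NFC can only compose when a precomposed code point exists.
# 'q' + acute has none, so it stays two code points after normalizing.
still_two = unicodedata.normalize("NFC", "q\u0301")
assert len(still_two) == 2
```

So normalizing first handles the common accented letters, but a base letter plus a combining mark with no precomposed form still occupies more than one code point.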
Regards,
- Graeme -
--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/