[fpc-devel] Unicode support (yet again)

Graeme Geldenhuys graemeg.lists at gmail.com
Wed Sep 14 11:32:49 CEST 2011


On 14/09/2011 11:04, Felipe Monteiro de Carvalho wrote:
> 
> IMHO a platform-dependent string would be the worst solution of all
> ... far worse than migrating to UTF-16.

I don't see why?  Use the RTL functions to manipulate your text strings.
Both the string type and the RTL functions will use the same encoding on
each platform - so no problems, no conversions.

If you really needed to know the encoding, the RTL could include a
helper function to tell you the encoding of any string (just like Delphi
2009+ has).


> Just recently I had a student from my university implement a routine
> which converts HTML text from utf-8 to braille in utf-8 ... I didn't

Again, no problem. The HTML should have specified the encoding it is in.
Normally that would be UTF-8. So under Linux, Mac OS X etc. it will
already be in the native encoding. Under Windows, text is normally stored
in UTF-8 as well, even though UTF-16 is the encoding of the native
Windows API. So when loading the file you can compare the HTML file
encoding to the current RTL encoding and do a conversion if needed (same
as is required in Delphi).
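
To make the load step concrete, here is a rough sketch. It is written in
Python rather than Pascal, purely as an illustration of the idea. The names
to_native and NATIVE_ENCODING are made up for this example and are not part
of any RTL, and the native encoding is simply assumed to be UTF-16 on
Windows and UTF-8 everywhere else.

```python
import codecs
import sys

# Assumption for this sketch only: the "native" string encoding is
# UTF-16 (little endian) on Windows and UTF-8 on every other platform.
NATIVE_ENCODING = "utf-16-le" if sys.platform == "win32" else "utf-8"


def to_native(raw: bytes, declared_encoding: str,
              native_encoding: str = NATIVE_ENCODING) -> bytes:
    """Return raw in native_encoding, converting only when the two differ."""
    # codecs.lookup() normalises aliases, so "UTF-8" and "utf8" compare equal.
    declared = codecs.lookup(declared_encoding).name
    native = codecs.lookup(native_encoding).name
    if declared == native:
        return raw  # already in the native encoding: no conversion
    return raw.decode(declared).encode(native)
```

With a UTF-8 HTML file this is a no-op on a UTF-8 platform and a single
conversion on a UTF-16 one, which is all the loading code needs to care
about.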

As for the text-to-braille functionality, that is outside the scope of
FPC and the RTL. But common sense should prevail: use RTL string
functions to implement your conversion, and don't assume 1 byte = 1
character. A Unicode-aware string iterator could be implemented to help
you step through the characters one at a time. Such a string iterator
could even become part of the RTL, as it will probably be used often by
many parsers.
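
As a rough idea of what such an iterator does, here is a minimal sketch for
UTF-8 data. Again it is in Python only for illustration; iter_utf8_chars is
a hypothetical name, not an existing RTL routine. It treats one code point
as one "character", so combining sequences are not grouped together.

```python
def iter_utf8_chars(data: bytes):
    """Yield each code point of a UTF-8 byte string as its own bytes slice."""
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte tells us how many bytes this code point occupies.
        if lead < 0x80:
            size = 1                  # 0xxxxxxx (ASCII)
        elif lead >> 5 == 0b110:
            size = 2                  # 110xxxxx
        elif lead >> 4 == 0b1110:
            size = 3                  # 1110xxxx
        elif lead >> 3 == 0b11110:
            size = 4                  # 11110xxx
        else:
            raise ValueError(f"invalid UTF-8 lead byte at offset {i}")
        chunk = data[i:i + size]
        # Decoding validates the continuation bytes and raises on
        # truncated or malformed input.
        chunk.decode("utf-8")
        yield chunk
        i += size


# Example: one ASCII letter, one braille pattern (U+2801), one accented letter.
text = "a\u2801\u00e9".encode("utf-8")
for ch in iter_utf8_chars(text):
    print(len(ch), ch.decode("utf-8"))
```

The three characters occupy 1, 3 and 2 bytes respectively, which is exactly
why indexing the string byte by byte goes wrong.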


Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/



