[fpc-devel] Unicode support (yet again)
Graeme Geldenhuys
graemeg.lists at gmail.com
Wed Sep 14 11:48:49 CEST 2011
On 14/09/2011 11:19, Luiz Americo Pereira Camara wrote:
> This is not desirable simply because at each platform (windows / unix)
> the user code of the same program will have a different encoding
> increasing the possibility of subtle errors.
Why? Not every program is a text manipulation program or text parser.
Most programs simply assign one string to another.
e.g.:
Button1.Caption := 'Click me';
lMyString := Button1.Caption;
Under Unix systems 'Click me', Button1.Caption and lMyString will be
UTF-8 encoded. Under Windows 'Click me', Button1.Caption and lMyString
will be UTF-16 encoded.
When Lazarus saves this information in a .lfm file, it will be stored as
UTF-8 irrespective of the platform. This is normal behaviour on all
platforms already, and already done in Lazarus too.
As for streaming, the same applies as for saving to file. UTF-8 is
ideally suited for (and was designed for simplifying) streaming, hence
the W3C promotes the usage of UTF-8 in HTML, XML etc.
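To illustrate the idea, here is a minimal sketch of a string type that
is native on each platform but always streamed as UTF-8. TNativeString
and ToStorageUTF8 are hypothetical names made up for this example; they
are not part of the FPC RTL, the LCL or fpGUI. The sketch assumes a
Free Pascal version that provides UnicodeString and UTF8Encode().

```pascal
program NativeStringSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical alias: one name, a different encoding per platform.
  {$IFDEF WINDOWS}
  TNativeString = UnicodeString;  // UTF-16 code units
  {$ELSE}
  TNativeString = UTF8String;     // UTF-8 bytes
  {$ENDIF}

// Hypothetical helper: always hand UTF-8 to the file or stream,
// whatever the in-memory encoding is.
function ToStorageUTF8(const S: TNativeString): UTF8String;
begin
  {$IFDEF WINDOWS}
  Result := UTF8Encode(S);  // convert UTF-16 to UTF-8
  {$ELSE}
  Result := S;              // already UTF-8, nothing to do
  {$ENDIF}
end;

var
  Caption, MyString: TNativeString;
begin
  Caption := 'Click me';
  MyString := Caption;  // plain assignment, no encoding knowledge needed
  // The stored form has the same byte length on both platforms.
  WriteLn(Length(ToStorageUTF8(MyString)));  // prints 8
end.
```

The assignments in the main block are identical on every platform; only
the alias and the one conversion helper know about the encoding.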
> Another advantage of using RTLString as i proposed is that Lazarus will
> require almost no code change since the encoding of string in LCL will
> be the same (UTF8) across platforms.
Lazarus, like fpGUI, will have to decide what it wants to do: stick to
having UTF-8 forced on all platforms, or use a native encoding on each
platform. So far UTF-8 was chosen in both projects because it is so
compatible (think easy here) with AnsiString - so the least amount of
work was required and it was pretty efficient because most programs
already used AnsiString.
If I were to change fpGUI to use a native encoding on each platform, I
would simply change my definition of TfpgString as described in a
similar example before. All string manipulation inside fpGUI (and LCL)
should already have adhered to the rule that 1 byte <> 1 character, so
the rest of the framework should continue to work as normal. In the case
of fpGUI, I would also be able to get rid of all the UTF8Copy(),
UTF8Length() calls and simply use the RTL Copy() and Length() functions
again - after all, they were only introduced because FPC's RTL lacked
Unicode (any encoding) support.
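As a rough illustration of the "1 byte <> 1 character" rule (a sketch
only, not the actual fpGUI or LCL code): with UTF-8 held in an
AnsiString, the RTL Length() reports bytes, so a separate routine is
needed to count code points. CountUTF8CodePoints is a hypothetical
helper written for this example; it assumes well-formed UTF-8 and
counts code points, not user-perceived characters.

```pascal
program CodePointCount;
{$mode objfpc}{$H+}

// Hypothetical helper: count code points by skipping UTF-8
// continuation bytes, which always have the bit pattern 10xxxxxx.
function CountUTF8CodePoints(const S: AnsiString): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if (Ord(S[I]) and $C0) <> $80 then
      Inc(Result);
end;

var
  S: AnsiString;
begin
  // "cafe" with an acute accent on the e, written as raw UTF-8 bytes
  // so the example does not depend on the source file encoding.
  S := 'caf'#$C3#$A9;
  WriteLn(Length(S));               // 5 (bytes)
  WriteLn(CountUTF8CodePoints(S));  // 4 (code points)
end.
```

With a string type whose Length() and Copy() already work in units the
RTL understands, wrappers of this kind are what could be dropped.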
Regards,
- Graeme -
--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/