[fpc-devel] Unicode resource strings

Tue Aug 21 11:41:05 CEST 2012

On Tue, 21 Aug 2012 11:17:24 +0200
Aleksa Todorovic <alexione at gmail.com> wrote:

> On Tue, Aug 21, 2012 at 9:53 AM, Martin Schreiber <mse00000 at gmail.com> wrote:
> > On 21.08.2012 09:31, Graeme Geldenhuys wrote:
> >
> >
> > Ehm, I did both. In the beginning MSEgui switched from Widestring to utf-8
> > encoded Ansistring because of the buggy FPC widestring implementation
> > (MSEgui started with Delphi/Kylix). Some weeks later I switched back to
> > widestring and bit the bullet, writing FPC bug reports until it reached
> > usable stability.
> >
> >
> >> But if you are such a UTF-16 (actually UCS-2 as that is what
> >> MSEgui supports) fan, why isn't MSEgui source code stored
> >> in UTF-16 encoding either? ;-)
> >
> >
> > Sure, MSEgui uses utf-8 for external storage and exchange of text data.
> > Internally everything is 16 bit UnicodeString. Use the best encoding for
> > the task. ;-)
> 
> +1
> 
> There are a lot of encodings around, each used in a different area:
> - external text assets could be in any encoding (system-locale
> encoding, UTF8, UTF16 both BE and LE - for example, MS Excel exports
> UTF16 text files)
> - Windows system calls are UTF16, on most other platforms UTF8
> - input translation (physical keyboard to Unicode character)
> - internal application representation is the developer's choice
> 
> The problem here is that libraries floating around (including RTL and
> FCL) use different string types (UnicodeString, UTF8String,
> AnsiString), so the question is - is it possible to (re)write those
> libraries in a generic way (RawByteString?), so they can work with any
> string type?

Theoretically you could rewrite the FCL to support UTF8String,
UnicodeString and AnsiString. But not at the same time. In an
application there can only ever be one of them. So you would have to
ship a whole FCL, plus all packages that depend on it, for each flavor.
I guess the FPC team wants to support at most one legacy and one
Unicode version. And eventually only the Unicode version.

> In my experience, only about 1% of an application requires handling of
> individual Unicode characters (input, rendering, GUI text editing).
> Other parts of the application can happily live without that knowledge :-)

True.
But that 1% may be scattered around the whole application and there
are no compiler warnings, so it is hard to find all places.
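
To make that concrete, here is a minimal sketch of the kind of code that hides in that 1%. It is in Python rather than Pascal only so that it runs anywhere, and the helper names (utf16_units, naive_truncate, safe_truncate) are made up for this illustration; nothing here is taken from the RTL or FCL. It truncates a string by 16-bit code units, which is what plain indexing on a 16-bit string does, and shows how that silently splits a character outside the BMP.

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints (hypothetical helper)."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

def units_to_text(units, errors="strict"):
    """Turn a list of UTF-16 code units back into a string."""
    data = struct.pack("<%dH" % len(units), *units)
    return data.decode("utf-16-le", errors=errors)

def naive_truncate(text, max_units):
    """Cut after max_units code units, the way naive 16-bit indexing would."""
    units = utf16_units(text)[:max_units]
    # A split surrogate pair is not valid UTF-16; decode leniently to show the damage.
    return units_to_text(units, errors="replace")

def safe_truncate(text, max_units):
    """Cut after at most max_units code units without splitting a surrogate pair."""
    units = utf16_units(text)[:max_units]
    if units and 0xD800 <= units[-1] <= 0xDBFF:
        units.pop()  # drop a dangling high surrogate
    return units_to_text(units)

# 'a', MUSICAL SYMBOL G CLEF (U+1D11E, outside the BMP), 'b'
sample = "a\U0001D11Eb"

print(len(utf16_units(sample)))          # code units, not characters
print(repr(naive_truncate(sample, 2)))   # cut falls inside the surrogate pair
print(repr(safe_truncate(sample, 2)))    # cut moved back to a character boundary
```

Neither version produces a compiler warning or a runtime error on ASCII or BMP-only test data, which is why such places are hard to find.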

Mattias