[fpc-devel] Unicode and UTF8String

Felipe Monteiro de Carvalho felipemonteiro.carvalho at gmail.com
Mon Dec 1 23:08:43 CET 2008


On Mon, Dec 1, 2008 at 7:26 PM, Marco van de Voort <marcov at stack.nl> wrote:
>> A string whose encoding is unknown is very inconvenient for
>> developers.
>
> I don't see that so strongly as most.

I know I spoke only generally, but it's hard to speak about and
foresee the effects of something which currently doesn't exist (a
cross-platform library using a string type whose encoding is unknown).
Everyone out there has chosen an encoding. wxWidgets went with
UTF-16, Gtk+ went with UTF-8. Who is choosing a string with unknown
encoding?

I see only some cases where the string is a class, but those are very
different from what is proposed.

> It is btw not just about performance, but also about predictability. Less
> encodings in use, means better predictability.

I think it's the other way around. If you know which encoding is
expected, you get a more predictable result, because you know exactly
where a conversion will take place. For example:

MyUTF8String := MyRTLString;

Say we get a report that characters are being lost, but only on
Windows. The assignment above is the only place where a conversion
happens, so we look there and it turns out there was a problem with
the conversion routine in the RTL.
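
To make that concrete, here is a minimal sketch of what I mean by
knowing where the conversion happens. It is only an illustration, not
code from the RTL or the LCL. It assumes Free Pascal in objfpc mode and
uses WideString and UTF8String as stand-ins for two string types with
known encodings; the program and variable names are made up.
UTF8Encode from the System unit is the single visible conversion point:

```pascal
program ExplicitConversion;
{$mode objfpc}{$H+}

var
  Wide: WideString;   // known encoding: UTF-16
  Utf8: UTF8String;   // known encoding: UTF-8
begin
  // Build "caf" + U+00E9 from the code point, so the encoding of
  // this source file does not matter.
  Wide := 'caf';
  Wide := Wide + WideChar($00E9);

  // The only place where the encoding changes. If characters get
  // lost, this is the line to look at.
  Utf8 := UTF8Encode(Wide);

  WriteLn(Length(Wide));  // number of UTF-16 code units
  WriteLn(Length(Utf8));  // number of bytes
end.
```

The two lengths differ because U+00E9 takes one UTF-16 code unit but
two bytes in UTF-8. With a string of unknown encoding there would be no
single line like the UTF8Encode call to point at.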

> To be honest, I think a case for LCL follows widget set encoding could also
> be made.

This was already investigated: the conversion time is negligible
compared to the paint time.

Using RTLString increases the problems for developers, because they
need to identify every place where they do something which requires
knowing the encoding, and then add conversion code there, which
increases the size of the code. Remember that we are expecting to
build software in a RAD way.

So on one hand you have no substantial gain, and on the other some
annoyance.

-- 
Felipe Monteiro de Carvalho


