[fpc-pascal] Funny things about utf-8 strings on mac

Florian Klaempfl florian at freepascal.org
Wed Jun 13 12:00:09 CEST 2007


Jonas Maebe wrote:
> 
> On 13 jun 2007, at 11:26, Florian Klaempfl wrote:
> 
>>> Sorry, but this view is too "terminal-centric" as far as I am concerned.
>>> That's not something you want to tell users of a GUI app. Or even
>>> programmers, for that matter. I really don't see a reason why this
>>> should not be configurable by the programmer himself.
>>
>> Well, then something is wrong with the design. Ansistrings are, by
>> definition, strings which use the default 8-bit encoding of the
>> environment.
> 
> The problem is the definition of "environment". What libiconv considers
> the environment (some terminal environment variables) does not
> necessarily match the APIs you are using in your program.

This is now a matter for the cwstring unit.

> 
>> Always putting utf-8 into them is abusing them, and that's
>> why there is a utf8string type in the system unit.
> 
> I'm not saying that they should always contain utf-8, but that the
> programmer should be able to control this. It's also not just about
> explicitly using ansistrings, but also about constant strings. If you have
> 
> api_which_expects_utf8_string(p: pchar)
> 
> and do
> 
> api_which_expects_utf8_string('łóżka')

The problem is that the compiler needs conversion rules for how to
interpret this, and internally the compiler knows only widestrings, not
utf8 or any other string encoding. Maintaining different encodings in
the compiler is imo also too much.

> 
> then you'd like a way for that constant string to be passed as utf-8 in
> all cases without needing to put utf8encode() calls everywhere in your
> program (especially if all api routines I use expect utf-8, which is
> pretty much the case on Darwin/Mac OS X).

Then you have to use the utf8string type. If it doesn't work well
enough, we have to fix it.
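
A minimal sketch of what this could look like, assuming the system unit's
UTF8Encode and a made-up stand-in for the API from the example above
(api_which_expects_utf8_string here is hypothetical and only counts the
bytes it receives). The wide string is built from code points so that the
result does not depend on the encoding of the source file:

```pascal
program utf8demo;

{$mode objfpc}{$H+}

{ Hypothetical stand-in for an API that expects UTF-8 in a pchar.
  It only counts the bytes up to the terminating #0. }
function api_which_expects_utf8_string(p: pchar): sizeint;
begin
  result := 0;
  while p[result] <> #0 do
    inc(result);
end;

var
  w: widestring;
  s: utf8string;
begin
  { Build the five characters of the example string from their
    code points: U+0142, U+00F3, U+017C, 'k', 'a'. }
  setlength(w, 5);
  w[1] := widechar($0142);
  w[2] := widechar($00F3);
  w[3] := widechar($017C);
  w[4] := widechar($006B);
  w[5] := widechar($0061);

  { Explicit conversion: the result is UTF-8 regardless of the
    default 8-bit encoding of the environment. }
  s := UTF8Encode(w);

  { The first three characters take two bytes each in UTF-8,
    the last two take one byte each, so 8 bytes are passed. }
  writeln(api_which_expects_utf8_string(pchar(s)));
end.
```

This still needs one explicit UTF8Encode call at the point where the
string is created, but after that the utf8string can be handed to any
number of such APIs without further conversion.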


