[fpc-devel] Unicode support (yet again)

Tomas Hajny XHajT03 at hajny.biz
Fri Sep 16 17:19:07 CEST 2011


On Fri, September 16, 2011 14:03, Marco van de Voort wrote:
> In our previous episode, Tomas Hajny said:
>>  .
>> > In the UTF8 RTL, all "string"s _ARE_ utf8, unless specified otherwise
>> > (by naming them unicodestring or ansistring(..encoding) or shortstrings).
>> >
>> > So the same virtual method with a STRING parameter will be
>> > TUnicodestring in the UTF16 rtl and UTF8string in the utf8 RTL.
>>
>> Sorry, one thing I'm missing in this point - where exactly is the
>> indexed (SBCS codepage based) version in this if string always means
>> either UnicodeString or UTF8String depending on the context / defines?
>> Would there be no SBCS version any longer, or is this a third option,
>> or what?
>
> It is a third option but only maybe for a while on Windows and the only
> option for platforms that can't or won't support unicode.  (like Dos)
>
> The idea is more or less that this trick can be employed for any
> ansi/unicodestring type. The shortstring overloads are already there and
> probably can stay.  So Dos or OS/2 are not in danger.

Understood. I was not asking about shortstring but mostly about
"ansistring" and "string".


> It also means three possible options for Windows. But ascii is temporary,
> and Windows/utf8 is only for Lazarus.  (which I hope will see the light
> and migrate to utf16 in time too)
 .
 .

I guess it remains to be seen how "temporary" it is for Windows, but
that does not really matter to me personally.


>> Was your point about "string", or "RTLString"?
>
> I'm thinking about "string", but that is more directed towards the OOP
> parts, which assume an objfpc {$h+} or Delphi mode.
>
> So the base RTL functions like fileopen will be rawbytestring that accepts
> _all_ encodings, (so also ansi/utf8/utf16 in ansi/utf8/utf16 mode) and
> runtime convert if necessary and possible or explicitly typed.  It
> depends on the amount of routines that don't fall into these categories
> if something like RTLSTRING is possible.
>
> Of course the "accepting" of all encodings is on interface levels.
> Implementations for platforms like dos are not supposed to support them
> all, just the ones they always did.

OK, I see. I assume that an option to override the "string" meaning
(similarly to current $H+/-) among ansi/utf8/utf16 would be created then
(if not already available in cpstrnew), right?
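
For comparison, here is a minimal sketch of how the existing $H switch
already changes what "string" means within a single source file. The type
names TShortDefault and TAnsiDefault are made up for illustration only; an
equivalent selector among ansi/utf8/utf16 is only what the question above
asks about and is not shown here.

```pascal
program StringMeaning;

{$mode objfpc}

{$H-}
type
  // Hypothetical alias: with $H- in effect, "string" means ShortString.
  TShortDefault = string;

{$H+}
type
  // Hypothetical alias: with $H+ in effect, "string" means AnsiString.
  TAnsiDefault = string;

begin
  // A ShortString is stored inline (length byte plus character data).
  WriteLn('$H- SizeOf(string) = ', SizeOf(TShortDefault));
  // An AnsiString variable is a reference to heap data, so it is
  // pointer-sized regardless of the string's length.
  WriteLn('$H+ string is pointer-sized: ',
          SizeOf(TAnsiDefault) = SizeOf(Pointer));
end.
```

Since $H is a local directive, the meaning of "string" is fixed at the point
where a type or variable is declared, which is the behaviour I would expect
an encoding selector to mirror.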

Tomas




