[fpc-devel] FPC 2.3.1 seems a mixed mess with Unicode support

Tue Sep 15 13:53:07 CEST 2009

In our previous episode, Micha Nelissen said:
> >>>   MyChar := MyString[1];
> >>>   writeln(MyChar);
> >>> end.
> >> Extracting a Char from a UnicodeString? What's that supposed to do?
> > 
> > CHAR is a 16-bit wchar in D2009.  Similarly, pchar is a pointer to a 16-bit
> > char. (pansichar being the 1-byte one).

... and most importantly, STRING is unicodestring. So running D2009 unit tests
on FPC, or claiming Unicode compatibility with D2009, is pointless at the
moment unless we have some idea how we are going to deal with the defaults.

(E.g. per platform, depending on the default granularity of the target, except
in Delphi mode; switches for per-unit behaviour; etc.)

Note that _IF_ we really follow D2009, without any additional FPC-specific
stuff, this might mean a complete fork of both FPC and Lazarus into
pre-D2009 and D2009+ modes.

> And if MyChar is declared as a WideChar? Then it does work?

No.

> Isn't it 
> like assigning a LongInt to an Integer? It might be cut, screwed or stay 
> the same (depending on sizeof(Integer)).

No, since string[1] is a 16-bit expression. Delphi string support works at
the granularity of the encoding (16-bit UTF-16 code units), not with
code points, or even characters. Only some specialized functions allow
character-based access over the full Unicode range.
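
To illustrate the granularity point, here is a minimal sketch. It assumes an
FPC version that has the 16-bit UnicodeString type; in D2009 plain string and
char would behave the same way. The program name and the choice of U+1D11E
(a character outside the BMP) are mine, purely for illustration.

```pascal
program CodeUnitDemo;
{$mode objfpc}{$H+}

var
  S: UnicodeString;
  C: WideChar;
begin
  { U+1D11E (musical G clef) lies outside the BMP, so UTF-16 stores it
    as the surrogate pair D834 DD1E. Build the string unit by unit. }
  SetLength(S, 2);
  S[1] := WideChar($D834);
  S[2] := WideChar($DD1E);

  { Indexing yields one 16-bit code unit, not the whole character. }
  C := S[1];

  writeln('Length(S) = ', Length(S));             { 2 code units, 1 character }
  writeln('Ord(S[1]) = $', HexStr(Ord(C), 4));    { D834, the high surrogate }
  writeln('SizeOf(C) = ', SizeOf(C));             { 2 bytes }
end.
```

So S[1] here is half of a character: a perfectly valid 16-bit value, but not
something you can meaningfully print on its own.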

See also http://www.stack.nl/~marcov/unicode.pdf though that could use a
few updates.