[fpc-pascal] FPC vs Delphi's unicode string support questions

Sven Barth pascaldragon at googlemail.com
Sat Aug 18 22:33:16 CEST 2012


On 18.08.2012 16:15, Jürgen Hestermann wrote:
> The few things I know about are:
>
>
> On 2012-08-18 15:54, Graeme Geldenhuys wrote:
>> 1) Is it correct that String <> AnsiString any more?
>
> Well, it never was. At least "string" could also be a ShortString, and maybe
> other string types too by now (I don't know).

"String" can mean "ShortString", "AnsiString" or "UnicodeString",
depending on the compiler settings:

Non-Delphi modes and $H- (default): ShortString
Delphi mode: AnsiString
DelphiUnicode mode: UnicodeString
Non-Delphi modes and $H+: AnsiString
Non-Delphi modes and $H- and modeswitch "unicodestrings": (AFAIK) ShortString
Non-Delphi modes and $H+ and modeswitch "unicodestrings": UnicodeString
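
To illustrate the $H- and $H+ rows of the list above, here is a minimal
sketch for a non-Delphi mode (objfpc). The program and type names are made
up for illustration. It relies on $H being a local switch, so "string"
resolves differently depending on where the type is declared.

```pascal
program StringKinds;
{$mode objfpc}

{$H-}
type
  // Hypothetical alias: under $H- "string" resolves to ShortString.
  TStringWithHOff = string;

{$H+}
type
  // Hypothetical alias: under $H+ "string" resolves to AnsiString.
  TStringWithHOn = string;

begin
  // A ShortString is a length byte plus 255 characters stored inline.
  WriteLn('SizeOf(string) under $H-: ', SizeOf(TStringWithHOff));
  // An AnsiString variable is only a pointer to heap-allocated data.
  WriteLn('SizeOf(string) under $H+: ', SizeOf(TStringWithHOn));
  WriteLn('SizeOf(Pointer):          ', SizeOf(Pointer));
end.
```

The first value should be the inline size of a ShortString, while the second
should match the pointer size of the target platform.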

>> 4) What Unicode encoding is used? UTF-8 or UTF-16?
>
> AFAIK UTF-8 is the preferred unicode string as it is used in the IDE and
> also many libraries (but not all). I am not sure what is planned for the
> future though.

If the target is Delphi compatibility then the default string type will 
be UnicodeString and thus the encoding will be UTF-16 (or UCS-2 to be 
more correct...).
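
As a small sketch of what the 16-bit encoding means in practice (the program
name and the chosen code point are only for illustration): Length on a
UnicodeString counts 16-bit code units, so a character outside the Basic
Multilingual Plane, which is stored as a surrogate pair, counts as two.

```pascal
program Utf16Units;
{$mode objfpc}{$H+}

var
  S: UnicodeString;
begin
  // U+1F600 lies outside the BMP; in UTF-16 it is the surrogate
  // pair $D83D $DE00, built here from the two raw code units.
  S := WideChar($D83D);
  S := S + WideChar($DE00);
  // Length reports code units, not code points.
  WriteLn('Code units: ', Length(S));
end.
```

Code that indexes such a string element by element therefore sees the two
halves of the pair separately.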

>> 5) Is it only the compiler that has unicode string type support. Has
>> anything been done or started in the RTL?
>
> The RTL still uses the ancient 255 char one-byte array.
> This works ok if paths are not longer than 255 chars and
> when you convert all strings to the (one-byte) string type
> that is used by the API of the OS (ANSI for Windows and I think UTF-8
> for Linux).

The RTL mostly uses PChar so that it is not restricted to 255 characters
(the exceptions are ancient compatibility units like DOS, Objects, etc.).
There are often overloads for ShortString and AnsiString though.

Regards,
Sven



