[fpc-devel] new string - question on usage

Hans-Peter Diettrich DrDiettrich1 at aol.com
Thu Oct 13 04:27:53 CEST 2011


Graeme Geldenhuys schrieb:
> On 12/10/2011 11:47, Martin Schreiber wrote:
>> idea. Have a look at Firemonkey and you know what I mean. ;-)
> 
> For those unfamiliar with Firemonkey, would you mind explaining further.
> 
> 
> ...but overall, I do agree with your statement that FPC shouldn't
> follow Delphi blindly. Delphi and the VCL are Windows-centric - their
> whole design doesn't fit other platforms. CLX (and I guess Firemonkey)
> was/is different from the VCL for a reason.

FireMonkey is a third-party product, now bought by Embarcadero. That
deal seems to be an attempt to avoid in-house development, which already
failed with both Kylix and Delphi.NET.

> Cross platform support needs
> more thought, eg: UTF-8 as native string type under *nix systems, and
> UTF-16 under Windows. Why must some platforms get a speed penalty and
> others not, when you force only one encoding on all platforms?

I don't see a speed penalty in using UTF-16. In contrast to UTF-8 it
simplifies (and consequently speeds up) all string handling. Memory
requirements may be higher with UTF-16, but only for strings that are
predominantly ASCII. Required conversions in calls to external
subroutines are very rare in real-life code, IMO.
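
To make the memory trade-off concrete, here is a minimal Python sketch. It is not from the original thread: the sample strings and the helper name `encoded_sizes` are made up for illustration, and it only counts encoded bytes. It says nothing about processing speed.

```python
# Compare how many bytes the same text needs in UTF-8 and in UTF-16.
# The sample strings are illustrative assumptions, not data from the thread.
samples = {
    "ascii": "Hello, world",
    "cyrillic": "Привет",
    "cjk": "你好世界",
}


def encoded_sizes(text):
    """Return (code points, UTF-8 bytes, UTF-16 bytes without a BOM)."""
    utf8_bytes = len(text.encode("utf-8"))
    # "utf-16-le" avoids the 2-byte byte-order mark that "utf-16" prepends.
    utf16_bytes = len(text.encode("utf-16-le"))
    return len(text), utf8_bytes, utf16_bytes


for name, text in samples.items():
    points, utf8_bytes, utf16_bytes = encoded_sizes(text)
    print(f"{name:9} code points={points:2} "
          f"utf-8={utf8_bytes:2} bytes  utf-16={utf16_bytes:2} bytes")
```

For the ASCII sample UTF-16 needs twice as many bytes as UTF-8, for the Cyrillic sample both need two bytes per character, and for the CJK sample UTF-8 needs three bytes per character against two in UTF-16.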

Did you ever hear Windows users complain about the speed penalty caused
by the inevitable UTF-8/UTF-16 conversions in calls to external (OS...)
subroutines?


> As for your statement regarding "do we need Unicode support everywhere?"
> Well, with Delphi 2009's Unicode support, the Delphi language now
> supports Unicode too. Thus unit names, class names, property names,
> variable names etc can all contain Unicode text in their names.

Ouch, you're right. IMO that's a misfeature :-(

Some people may like to *write* code with Russian or Chinese names, but
the majority of users won't like to *read* such code.

A true Unicode language should also use unique characters (codepoints)
for all keywords, operators and directives, reducing code size and
speeding up the parsing of source code - see e.g. APL. The IDE can
translate these items into replacement strings in source editor windows,
possibly depending on the user's language.

> So yes,
> Unicode is required throughout the Object Pascal language, and FPC
> Compiler. You can't have AnsiString only in some places, and Unicode
> support in others. It's all or nothing.

The consequence will be a break between the old (Ansi) and new
(Unicode) language, RTL etc., with little chance of a common codebase
for the compiler and libraries.

I've been very happy with the old FPC design, supporting Unicode in
UTF-8 strings, without the need for a bunch of new string types with
different Char sizes and conversions. Dropping the native (system)
codepage strings, and making "string" a UTF8String, would allow writing
Unicode-aware applications with a single RTL, LCL etc.

Under these conditions I would also be happy with a string=UTF16String
(or UTF32String) system, and a corresponding Char type, with AnsiStrings
dropped entirely. It's hard enough to support legacy ShortStrings along
with a dynamic string type.

DoDi



