[fpc-devel] new string - question on usage

Jonas Maebe jonas.maebe at elis.ugent.be
Tue Oct 11 00:18:06 CEST 2011


On 11 Oct 2011, at 00:06, Luiz Americo Pereira Camara wrote:

> On 10/10/2011 17:56, Jonas Maebe wrote:
>> On 10 Oct 2011, at 22:11, Luiz Americo Pereira Camara wrote:
>> 
>>> 1- Most of LCL must be code page agnostic, so it should not use UTF8String/AnsiString directly (keep String)
>> There is no difference between ansistring and string in {$mode delphi} and {$mode objfpc}
> 
> OK.
> There's just one problem using $mode to define the string behavior: say you have a component written in {$mode delphi}. Then code written in {$mode delphiunicode} uses that library.

That is no more of a problem than combining code that uses string in {$h-} mode with code that uses string in {$h+} mode.
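
A minimal sketch of that comparison, assuming nothing beyond the system unit (the type and variable names are made up for illustration): the same keyword "string" already denotes two different types depending on the {$h} state at the point of declaration, and values are converted implicitly where the two meet.

```pascal
program hswitchsketch;
{$mode objfpc}

{$H-}
type
  TShortText = string;   // here "string" means ShortString (at most 255 chars)
{$H+}
type
  TLongText = string;    // here "string" means AnsiString

var
  LongText: TLongText;
  ShortText: TShortText;
begin
  LongText := StringOfChar('x', 300);
  // Implicit conversion between the two meanings of "string";
  // the ShortString side cannot hold more than 255 characters.
  ShortText := LongText;
  WriteLn(Length(LongText), ' ', Length(ShortText));
end.
```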

>> . In a future delphiunicode mode or something like that string will be unicodestring
> 
> What about Marco's proposal of having separate versions of RTL/Classes for UTF8 / UTF16? Or did I miss something?

That would not change the meaning of the "string" type. The code in rtl/classes would then use a custom string type (RTLString or whatever) that is defined as either an utf8string or a unicodestring based on some define.
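
As a rough sketch of what that could look like (the define name RTL_UTF16 and the function Greeting are hypothetical, not existing FPC names, and this assumes a compiler that knows the utf8string and unicodestring types):

```pascal
program rtlstringsketch;
{$mode objfpc}{$H+}

type
{$ifdef RTL_UTF16}            // hypothetical define, e.g. passed as -dRTL_UTF16
  RTLString = UnicodeString;  // UTF-16 build of the library
{$else}
  RTLString = UTF8String;     // UTF-8 build of the library
{$endif}

// Library code is written against RTLString only, so the same source
// compiles for either build. The meaning of plain "string" is untouched.
function Greeting(const Name: RTLString): RTLString;
begin
  Result := 'Hello, ' + Name;
end;

begin
  WriteLn(Greeting('world'));
end.
```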

>> , but that's not "code-page agnostic" either. The only somewhat code page agnostic string type is RawByteString.
> 
> I don't mean a unicode-agnostic string type. I mean unicode-agnostic code, i.e., code that will work regardless of the code page

Generally, the only string types that will always work regardless of the code pages in use are utf8string and unicodestring (and maybe some utf32string type, although afaik that's just a dynamic array type). Any other string type except for RawByteString will result in code page conversions that may be lossy (and RawByteString itself has its own share of gotchas to watch out for if you use it for anything other than parameters).

>>> 2- It should have (I don't know if it currently has) a compiler switch to change the default code page to UTF8 or whatever, so all variables with type String will map to UTF8String.
>> I doubt that such a feature will be added. If you want that, declare your own string type with whatever default code page you want to use and use that type everywhere.
> 
> Ok. This in practice will force Lazarus to go to UTF16 since renaming all string types of LCL from String to UTF8String is a no-no, at least for me.

I really don't see how adding a feature to the compiler to change the default definition of the string type would change anything. As I said, you can achieve exactly the same result by using a custom defined string type.
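
For illustration only, such a custom type could be declared along these lines (TLCLString and ShowCaption are made-up names, and this assumes a compiler with code page aware ansistrings, where CP_UTF8 and StringCodePage come from the system unit):

```pascal
program customstringsketch;
{$mode objfpc}{$H+}

type
  // Hypothetical name: a distinct ansistring type whose declared
  // code page is UTF-8, chosen in one place for the whole code base.
  TLCLString = type AnsiString(CP_UTF8);

procedure ShowCaption(const Caption: TLCLString);
begin
  // StringCodePage reports the code page stored with the string data.
  WriteLn(Caption, ' (code page ', StringCodePage(Caption), ')');
end;

var
  S: TLCLString;
begin
  S := 'Button1';
  ShowCaption(S);
end.
```

Changing the default code page then means editing that single type declaration, rather than adding a compiler switch.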


Jonas

