[fpc-pascal] Warning not to use the "String" type with FPC 3.x

Jonas Maebe jonas.maebe at elis.ugent.be
Mon May 9 12:28:38 CEST 2016


Graeme Geldenhuys wrote:
> My reasoning for the above. Data loss during string conversions!
>
> 1. The "String" type has too many meanings. I recommend you simply stop
>     using it in your code. Instead, use the exact type you really mean or
>     support in your application.

The same is true in previous FPC versions.

> 2. AnsiString in FPC 3.x is now code page enabled, and can have many
>     different meanings.
>     There is no guarantee that assigning a UTF8String or UnicodeString
>     to an AnsiString is safe.

The same is true in previous FPC versions.

>     On some systems the AnsiString might
>     default to a UTF-8 or UTF-16 encoding which is fine, but importantly,
>     on other systems it might default to something else, and then you
>     simply lose data during the conversion. What makes it even worse,
>     is that the default encoding for AnsiString is determined at
>     runtime,

The same is true in previous FPC versions.

>     so the programmer is completely helpless in preventing
>     this.

Unlike in previous FPC versions, in FPC 3.x the programmer can specify
the encoding that ansistring should use at run time via
SetMultiByteConversionCodePage().
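
A minimal sketch of what that could look like, assuming FPC 3.x and that
UTF-8 is the encoding you want. The program name, the variable names and
the sample text are made up for illustration; cwstring is pulled in on
Unix-like systems only so that a widestring manager is installed.

```pascal
program SetAnsiCodePage;

{$mode objfpc}{$H+}

{$ifdef unix}
uses
  cwstring;  // installs a widestring manager on Unix-like systems
{$endif}

var
  U: UnicodeString;
  A: AnsiString;
begin
  // Code page picked up from the OS at startup (system dependent).
  WriteLn('Default code page at startup: ', DefaultSystemCodePage);

  // Ask the RTL to use UTF-8 for ansistring conversions from now on.
  SetMultiByteConversionCodePage(CP_UTF8);
  WriteLn('Default code page now:        ', DefaultSystemCodePage);

  // Hypothetical sample text containing a non-ASCII character (euro sign).
  U := 'price: 5 ' + WideChar($20AC);

  // UnicodeString -> AnsiString conversion uses the code page set above.
  A := U;
  WriteLn('Code page of A:               ', StringCodePage(A));
end.
```

The call only affects conversions performed after it, so it would
normally go at the very start of the program.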

>     This issue is also not specific to only certain operating
>     systems. You can have a Linux or FreeBSD system that defaults to
>     a non-Unicode type code page, yet the Linux system running right
>     next to that one could have been set up to use a UTF-8 code page.

The same is true with previous FPC versions (since this is obviously 
unrelated to the compiler/RTL/language at all, and previous FPC versions 
also queried the OS to set the code page used for ansistrings if a 
widestring manager was used -- without a widestring manager, nothing 
changes either).


Jonas


