[fpc-pascal] FPC 3 regression: cannot use TStringList for UTF-8 data any more?

Michael Schnell mschnell at lumino.de
Mon Apr 18 12:09:17 CEST 2016


On 04/16/2016 10:47 AM, Mattias Gaertner wrote:
> That's correct. String literals in a codepage other than system are 
> stored as UTF-16 in the binary 

(Assuming that by "other than system" you mean different from the 
DefaultSystemCodePage setting the compiler sees when it runs.)

I see. And of course that will work.

IMHO this is rather hard to understand, though, as UTF-16 seems to be 
a rather inefficient encoding for constants (for software that generally 
works with UTF-8), in terms of both storage and conversion effort.

Seemingly the compiler assumes that at runtime the executable will 
likely see a DefaultSystemCodePage that differs from the {$codepage ...} 
setting in the source code, and so abstains from trying to prepare the 
possible optimization the user might have intended by using {$codepage ...}.

> and converted on assign. 

By "on assign" you mean "when a constant string is assigned to a string 
whose type has a defined codepage (CP_ACP or any other 7- or 8-bit 
encoding, or CP_UTF16BE)", but not if it's CP_NONE or CP_UTF16.
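
A small test program along these lines could show what actually happens 
on assignment. It is only a sketch, assuming FPC 3.0 or later and a 
source file saved as UTF-8. The type name TCP1252String is made up for 
the test, and I make no claim about the output, which depends on the 
platform and on whether a widestring manager (e.g. the cwstring unit on 
Unix) is installed.

```pascal
program cpassign;
{$mode objfpc}{$H+}
{$codepage utf8}   // assumes this source file is saved as UTF-8

type
  // Hypothetical type for this test: an AnsiString with a fixed codepage.
  TCP1252String = type AnsiString(1252);

var
  a: AnsiString;       // declared codepage CP_ACP
  u: UTF8String;       // declared codepage CP_UTF8
  w: TCP1252String;    // declared codepage 1252
  r: RawByteString;    // declared codepage CP_NONE

procedure Report(const name: string; const s: RawByteString);
begin
  // Print only the dynamic codepage and the byte length, not the string
  // itself, so that console output does not trigger another conversion.
  WriteLn(name, ': codepage ', StringCodePage(s), ', length ', Length(s));
end;

begin
  // The same non-ASCII literal is assigned to each typed string.
  a := 'äöü';
  u := 'äöü';
  w := 'äöü';
  r := 'äöü';

  WriteLn('DefaultSystemCodePage = ', DefaultSystemCodePage);
  Report('AnsiString', a);
  Report('UTF8String', u);
  Report('TCP1252String', w);
  Report('RawByteString', r);   // the interesting case: CP_NONE
end.
```

Comparing the reported codepage and length for each variable with the 
declared codepage of its type should make it visible whether (and to 
what) the literal was converted.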


> The conversion happens at runtime, so the string codepage is decided 
> at runtime.
I'll re-check whether I see the converter working...

-Michael


