[fpc-devel] UTF-8 string literals

Michael Van Canneyt michael at freepascal.org
Fri May 5 15:55:32 CEST 2017



On Fri, 5 May 2017, Mattias Gaertner wrote:

> On Fri, 5 May 2017 14:30:32 +0200 (CEST)
> Michael Van Canneyt <michael at freepascal.org> wrote:
>
>> [...]
>> > AFAIK FPC stores UTF-8 string literals (-Fcutf8) as widestrings
>> > instead of UTF8String. Please correct me if I'm wrong.

To make sure I was presenting correct facts, I did some tests.

Based on those tests, I think the statement above is wrong.

{$codepage utf8}

var
   p : pchar;

begin
   P:=Pchar('some string literal');
end.

This results in the following assembler:

.globl  _$PROGRAM$_Ld1
_$PROGRAM$_Ld1:
         .ascii  "some string literal\000"
.Le11:

Not a widestring, as far as I can see: the literal is emitted as single
bytes with .ascii.

To be sure, I added some Russian characters:

.Ld1:
         .ascii  "some string literal \320\272\320\270\321\202\320\260"
         .ascii  "\320\271\321\201\320\272\320\276\320\263\320\276\000"

Again, not a widestring: each Cyrillic letter shows up as a two-byte sequence.

home: >cat u.pp
{$codepage utf8}
var
   p : pchar;

begin
   P:=Pchar('some string literal китайского');
end.
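
A minimal sketch, not part of the test above, for cross-checking the
listing at run time. It walks the PChar and prints every byte as an octal
escape, in the same notation as the .ascii lines. The program name
dumpliteral is made up, and the sketch assumes the source file is saved as
UTF-8. If the literal is stored as UTF-8, each Cyrillic letter should
appear as two bytes, the first being \320 or \321.

```pascal
{$codepage utf8}
program dumpliteral;  { hypothetical name, for illustration only }

var
  p : PChar;
  i : Integer;

begin
  p := PChar('some string literal китайского');
  i := 0;
  { Walk the zero-terminated buffer and print each byte as a }
  { three-digit octal escape, like the assembler listing does. }
  while p[i] <> #0 do
  begin
    Write('\', OctStr(Ord(p[i]), 3));
    Inc(i);
  end;
  WriteLn;
  WriteLn(i, ' bytes before the terminating #0');
end.
```

The ASCII part is printed in octal as well, so only the tail of the output
needs comparing with the \320... sequences in the listing.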

So, I tried a resourcestring:


.Ld3$strlab:
         .short  65001,1
         .long   0
         .quad   -1,30
.Ld3:
         .ascii  "some more \320\272\320\270\321\202\320\260\320\271\321"
         .ascii  "\201\320\272\320\276\320\263\320\276\000"

Again, no widestring, as far as I can see.
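
The header in front of .Ld3 points the same way, assuming the usual FPC
AnsiString record layout (code page, element size, padding on 64-bit
targets, reference count, length): 65001 is the UTF-8 code page, the
element size is 1 byte, -1 marks a constant, and the length 30 equals 10
bytes for 'some more ' plus 10 Cyrillic letters at 2 bytes each. The same
fields can be queried at run time. This is a sketch only: the names
inspectres and SMore are made up, and the values seen at run time may
differ from the static header if the RTL converts the string on assignment.

```pascal
{$mode objfpc}{$H+}
{$codepage utf8}
program inspectres;  { hypothetical name, for illustration only }

resourcestring
  SMore = 'some more китайского';  { hypothetical identifier }

var
  s : RawByteString;

begin
  { RawByteString is used so the assignment itself does not request }
  { a conversion to a particular code page. }
  s := SMore;
  WriteLn('code page:    ', StringCodePage(s));
  WriteLn('element size: ', StringElementSize(s));
  WriteLn('length:       ', Length(s));
end.
```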

Michael.

