[fpc-devel] Unicode support (yet again)

Hans-Peter Diettrich DrDiettrich1 at aol.com
Sun Sep 18 15:07:07 CEST 2011


Luiz Americo Pereira Camara schrieb:
> On 17/9/2011 11:46, Hans-Peter Diettrich wrote:
>> Luiz Americo Pereira Camara schrieb:
>>
>>> The codepage of a RawByteString at runtime will keep the previous 
>>> CodePage (65001 for UTF8, 1200 for UTF16) as opposed to changing to the 
>>> RawByteString CodePage (65535), as I thought previously
>>
>> Delphi defines RawByteString=AnsiString, so there is no room for 
>> UTF-16 in such a string.
> 
> No. I was wrong. See Florian's email. RawByteString will keep the codepage 
> (1200 = UTF16) and the data of the assigned string, be it UTF16 or UTF8.
> 
>>
>>> So the implementation would be:
>>>
>>> function FileGetAttr(const FileName: RawByteString): Longint;
>>> begin
>>> SetCodePage(FileName, 1200, True);
>>
>> Won't work, because of "const",
> 
> Yes
> 
>> and because UTF-16 is not a Byte (AnsiChar) string :-(
> 
> No. See above. Look on the net for the "Delphi and Unicode" doc by Marco Cantu

Can you give me a link? I checked the XE documentation and RTL, and 
found nothing indicating that RawByteString can hold UTF-16; my test 
confirms this:

var
   a: AnsiString;
   u: UnicodeString;

procedure test(r: RawByteString; cp: word);
begin
   WriteLn('in:  ', StringElementSize(r), ' cp: ', StringCodePage(r),
     ' len=', length(r));
   WriteLn('"', r, '"'); //writes garbage for non-OEM chars, of course
   SetCodePage(r, cp, true);
   WriteLn('out: ', StringElementSize(r), ' cp: ', StringCodePage(r),
     ' len=', length(r));
   a := r; //use the result, so that nothing can be optimized away
   WriteLn('"', r, '"');
end;

This reveals the following behaviour:

1) A UnicodeString is converted to AnsiString before it is passed to test.
2) Setting the codepage to 1200 doesn't change anything.
3) Conversion to UTF-8 seems to work (the length changes).
4) Conversion from UTF-8 to Ansi results in an empty string.
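
For anyone who wants to repeat the four cases, a driver along the following lines should do. It is only a sketch and not the exact code behind the results above: the sample text, the helper Show and the use of DefaultSystemCodePage as the "Ansi" target are assumptions, and it presumes Delphi 2009 or later, where SetCodePage, StringCodePage and StringElementSize live in the System unit.

```pascal
program RawByteTest;
{$APPTYPE CONSOLE}

// Hypothetical driver, assuming Delphi 2009+ (System unit only).

// Print element size, code page and length of a byte string.
procedure Show(const Tag: string; const r: RawByteString);
begin
  WriteLn(Tag, StringElementSize(r), ' cp: ', StringCodePage(r),
    ' len=', Length(r));
end;

// r is passed by value, so SetCodePage works on a local copy.
procedure Test(r: RawByteString; cp: Word);
begin
  Show('in:  ', r);
  SetCodePage(r, cp, True);  // True = convert the payload, not just the tag
  Show('out: ', r);
end;

var
  a: AnsiString;
  u: UnicodeString;
  u8: UTF8String;
begin
  u := 'Gr'#$00FC#$00DF'e';  // sample text with two non-ASCII characters
  a := AnsiString(u);        // explicit conversion to the system code page
  u8 := UTF8String(u);       // explicit conversion to UTF-8

  Test(u, 1200);                    // cases 1 and 2: UnicodeString argument, target UTF-16
  Test(a, 65001);                   // case 3: Ansi to UTF-8
  Test(u8, DefaultSystemCodePage);  // case 4: UTF-8 back to Ansi (assumed target)
end.
```

The output depends on the compiler version and the system code page, so no expected values are given here.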

I'll ask in an Embarcadero group, in particular about [4].

DoDi



