[fpc-devel] Unicode support (yet again)
Luiz Americo Pereira Camara
luizmed at oi.com.br
Sun Sep 18 16:17:44 CEST 2011
On 18/9/2011 10:07, Hans-Peter Diettrich wrote:
> Luiz Americo Pereira Camara schrieb:
>> On 17/9/2011 11:46, Hans-Peter Diettrich wrote:
>>> Luiz Americo Pereira Camara schrieb:
>>>
>>>> The codepage of a RawByteString at runtime will keep the previous
>>>> CodePage (65001 for UTF8, 1200 for UTF16) as opposed to changing to
>>>> the RawByteString CodePage (65535) as I thought previously
>>>
>>> Delphi defines RawByteString=AnsiString, so there is no room for
>>> UTF-16 in such a string.
>>
>> No. I was wrong. See Florian's email. RawByteString will keep the
>> codepage (1200 = UTF16) and the data of the assigned string, be it UTF8.
>>
>>>
>>>> So the implementation would be:
>>>>
>>>> function FileGetAttr(const FileName: RawByteString): Longint;
>>>> begin
>>>> SetCodePage(FileName, 1200, True);
>>>
>>> Won't work, because of "const",
>>
>> Yes
>>
>>> and because UTF-16 is not a Byte (AnsiChar) string :-(
>>
>> No. See above. Look on the net for the Delphi and Unicode doc by Marco Cantu
>
> Can you give me a link? I checked the XE documentation and RTL, and
> could not find that RawByteString can hold UTF-16, and my test
> confirms that:
>
http://edn.embarcadero.com/article/38980
You may also read:
http://www.micro-isv.asia/2008/08/using-rawbytestring-effectively/
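On the "const" objection above: a minimal sketch of one way around it, assuming an RTL that provides SetCodePage and StringCodePage with the Delphi 2009+ signatures. WithCodePage is a hypothetical helper name, not an RTL function, and I use 65001 (UTF-8) rather than 1200 because whether a RawByteString can carry UTF-16 is exactly the open question here. Untested on Delphi XE.

```pascal
program ConstCodePageSketch;
{$ifdef FPC}{$mode delphi}{$H+}{$endif}

// Hypothetical helper (not part of any RTL): returns a copy of S
// converted to, and tagged with, the requested code page.
function WithCodePage(const S: RawByteString; CP: Word): RawByteString;
begin
  Result := S;                    // work on a copy; S itself is const
  SetCodePage(Result, CP, True);  // True = convert the bytes as well
end;

var
  A: AnsiString;
  U8: RawByteString;
begin
  A := 'abc';
  U8 := WithCodePage(A, 65001);   // 65001 = UTF-8
  WriteLn('cp: ', StringCodePage(U8), ' len=', Length(U8));
end.
```

The const parameter is never modified; only the local copy in Result is passed to SetCodePage.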
> var
> a: AnsiString;
> u: UnicodeString;
>
> procedure test(r: RawByteString; cp: word);
> begin
>   WriteLn('in: ', StringElementSize(r), ' cp: ', StringCodePage(r),
>     ' len=', length(r));
>   WriteLn('"', r, '"'); //writes garbage for non-OEM chars, of course
>   SetCodePage(r, cp, true);
>   WriteLn('out: ', StringElementSize(r), ' cp: ', StringCodePage(r),
>     ' len=', length(r));
>   a := r; //use the result, so that nothing can be optimized away
>   WriteLn('"', r, '"');
> end;
>
> This reveals the following behaviour:
>
> 1) UnicodeString is converted to AnsiString before being passed to test.
> 2) Setting codepage to 1200 doesn't change anything.
> 3) Conversion to UTF-8 seems to work (length changed).
> 4) Conversion from UTF-8 to Ansi results in an empty string.
>
> I'll ask in an Embarcadero group, in particular about [4].
Are you using Delphi XE or FPC?
I don't have Delphi XE. What I know is from those docs and these discussions.
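To check point 1 in isolation, something like the sketch below could be compiled with either compiler. It assumes the UnicodeString overloads of StringElementSize and StringCodePage are available; the program name is made up, and I make no claim about what it prints on Delphi XE.

```pascal
program ElementSizeSketch;
{$ifdef FPC}{$mode delphi}{$H+}{$endif}

var
  u: UnicodeString;
  r: RawByteString;
begin
  u := 'abc';
  // Element size and code page of the UnicodeString itself
  WriteLn('u: ', StringElementSize(u), ' cp: ', StringCodePage(u));

  // Explicit cast: the UnicodeString is converted to an AnsiString
  // before it ends up in the RawByteString variable.
  r := AnsiString(u);
  WriteLn('r: ', StringElementSize(r), ' cp: ', StringCodePage(r),
    ' len=', Length(r));
end.
```

If the second line reports an element size of 1, the conversion happened at the assignment, before any SetCodePage call.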
Luiz