[fpc-devel] cpstrrtl/unicode branch merged to trunk

Sven Barth pascaldragon at googlemail.com
Sat Sep 7 14:52:52 CEST 2013


On 06.09.2013 22:48, Hans-Peter Diettrich wrote:
> Sven Barth wrote:
>> On 06.09.2013 14:16, "Hans-Peter Diettrich"
>> <DrDiettrich1 at aol.com> wrote:
>>> I'm not sure how efficient a RawByteString version can ever be. By
>>> default it has to convert the string into Unicode (Delphi: UTF-16),
>>> and the result back to CP_ACP. In these cases it looks more efficient
>>> to call the Unicode version immediately, and leave *eventual* further
>>> conversions to the compiler. Some routines may implement common
>>> processing of true SBCS, but I'm not sure how many of these there are.
>>
>> Not every RTL will use a 16-bit API. On Windows the RawByteString
>> variant might be slower, but on Linux it will be faster as long as the
>> string passed in is encoded in the system encoding (mostly UTF-8).
>
> Then CP_ACP on Linux will be UTF-8, and automatic conversion can use
> UTF-8 instead of UTF-16.
>
> I consider both UTF-8 and UTF-16 "Unicode", so it's debatable
> whether the UnicodeString type will then also become UTF-8, for even
> fewer conversions. But when the user decides that he wants UTF-16
> UnicodeStrings, for simplified handling of BMP text...

UnicodeString is not UTF-8.

In case of Linux + String=UnicodeString the call sequence will look like
UnicodeString function => RawByteString function => OS function.
In case of Linux + String=AnsiString the call sequence will look like
RawByteString function => OS function.
In case of Windows + String=UnicodeString the call sequence will look
like UnicodeString function => OS function.
In case of Windows + String=AnsiString the call sequence will look like
RawByteString function => UnicodeString function => OS function.
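
The four cases above follow one rule: the overload that matches the
platform's native API calls the OS directly, and the other overload
converts and delegates to it. A rough model of that rule, as a sketch
only: it is written in Python rather than Pascal, it is not RTL code,
and the names NATIVE_API and call_sequence are made up for the
illustration. It assumes a byte-oriented API on Linux and a 16-bit API
on Windows, as discussed above.

```python
# Illustrative model only; none of these names exist in the FPC RTL.
# Assumption: Linux exposes a byte-oriented API, Windows a 16-bit one.
NATIVE_API = {"Linux": "RawByteString", "Windows": "UnicodeString"}


def call_sequence(os_name, string_type):
    """Return the chain of calls for a routine invoked with a `String`.

    string_type is what `String` maps to: "AnsiString" or "UnicodeString".
    """
    # An AnsiString argument enters through the RawByteString overload.
    entry = "RawByteString" if string_type == "AnsiString" else "UnicodeString"
    native = NATIVE_API[os_name]

    chain = [entry + " function"]
    if entry != native:
        # The entry overload converts the string and delegates to the
        # overload that matches the platform's native API.
        chain.append(native + " function")
    chain.append("OS function")
    return chain


for os_name in ("Linux", "Windows"):
    for string_type in ("UnicodeString", "AnsiString"):
        chain = " => ".join(call_sequence(os_name, string_type))
        print(os_name + " + String=" + string_type + ": " + chain)
```

Each combination costs at most one extra hop, and that hop is the one
place where an encoding conversion happens.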

Regards,
Sven


