[fpc-devel] Unicode RTL
XHajT03 at mbox.vol.cz
Wed Nov 16 17:08:54 CET 2005
Marco van de Voort wrote:
>> >> >
>> >> > ... has a different implementation for utf-8 and 8-bit code pages.
>> >> Why? With utf-8 a string is searched, with 8-bit cp one char. No
>> >> char/sequence of char other than ë can generate the byte sequence
>> >> representing ë
>> > const s : 'Daniël';
>> > var accent : utf8char;
>> > x:=pos('i','Daniël');
>> > accent:=s[x+1];
>> We could have special support for assignment to type utf8char, couldn't
> It would be horribly slow, since this would apply to length too, and think
> while i<length(x) do inc(i); like constructs.
> I think the avg delphi code simply assumes 100% that chars are fixed
I'm afraid that you won't get very far with that assumption. "Existing
Delphi code" most probably isn't DBCS/MBCS safe.
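
To make the quoted example concrete, here is a minimal sketch in Free
Pascal. It assumes the accented character in the quoted example is ë
(bytes $C3 $AB in UTF-8) and that the string is held as raw UTF-8 bytes in
a plain AnsiString. Utf8SeqLen and Utf8CharAt are hypothetical helpers
written for this illustration, not RTL functions, and the literal is spelled
as explicit bytes so the source file encoding does not matter.

```pascal
program Utf8IndexDemo;
{$mode objfpc}{$H+}

{ Hypothetical helper: byte length of the UTF-8 sequence whose first byte
  is b. Continuation or invalid bytes are treated as length 1. }
function Utf8SeqLen(b: Byte): Integer;
begin
  if b < $80 then
    Result := 1                 // 0xxxxxxx: plain ASCII
  else if (b and $E0) = $C0 then
    Result := 2                 // 110xxxxx: two-byte sequence
  else if (b and $F0) = $E0 then
    Result := 3                 // 1110xxxx: three-byte sequence
  else if (b and $F8) = $F0 then
    Result := 4                 // 11110xxx: four-byte sequence
  else
    Result := 1;                // continuation or invalid byte
end;

{ Hypothetical helper: the whole encoded character starting at byte index i. }
function Utf8CharAt(const s: AnsiString; i: Integer): AnsiString;
begin
  Result := Copy(s, i, Utf8SeqLen(Ord(s[i])));
end;

var
  s, accent: AnsiString;
  x: Integer;
begin
  s := 'Dani'#$C3#$AB'l';       // "Daniel" with e-diaeresis, as UTF-8 bytes

  x := Pos('i', s);             // byte-wise search is still correct here
  WriteLn('Pos = ', x);                             // 4
  WriteLn('Length in bytes = ', Length(s));         // 7, not 6 characters

  // Plain indexing returns only the lead byte of the two-byte sequence.
  WriteLn('s[x+1] = $', HexStr(Ord(s[x + 1]), 2));  // C3

  // Copying the full sequence requires knowing its length.
  accent := Utf8CharAt(s, x + 1);
  WriteLn('bytes in sequence = ', Length(accent));  // 2
end.
```

The byte-wise Pos finds 'i' correctly because every byte of a multi-byte
UTF-8 sequence is $80 or above, so an ASCII byte can never match inside one.
Only the indexing and Length steps break the fixed-width assumption.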
Regarding constructs like "while i<length(x) do": I'd say that the most
common uses of these are comparison, copying, conversion to
uppercase/lowercase, and combinations of these. All these operations should
be performed using dedicated (RTL) functions, otherwise they will fail in
DBCS/MBCS environment anyway (or at least result in suboptimal