[fpc-devel] Unicode in the RTL (my ideas)
mse00000 at gmail.com
Tue Aug 21 10:17:56 CEST 2012
On 21.08.2012 09:55, Graeme Geldenhuys wrote:
> On 21 August 2012 07:10, Ivanko B <ivankob4mse2 at gmail.com> wrote:
>> How about supporting in the RTL all versions of UCS-2 & UTF-16 (for
>> fast per-char access etc optimizations) and UTF-8 (for unlimited
>> number of alphabets) ?
> All "access a char by index into a string" code I have seen, 99.99% of
> the time work in a sequential manner. For that reason there is no
> speed difference between using a UTF-16 or UTF-8 encoded string. Both
> can be coded equally efficient.
Graeme, this is simply not true. When searching for known German characters
in a UnicodeString, the program can use the simple approach of indexing by
character (code unit). This even works for known Chinese characters in the
BMP. And a simple "if" for surrogate pairs is more efficient than a 4-branch
"case" for UTF-8.
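
To make the comparison concrete, here is a rough sketch in Python (not FPC code, and not anything from the RTL). All function names are made up for illustration, and both decoders assume well-formed input with no error handling. It shows the single surrogate test needed per UTF-16 code unit next to the four-way lead-byte dispatch needed for UTF-8, plus the plain code unit search that is enough for a known BMP character.

```python
import struct

# Illustrative sketch only: these helpers are hypothetical, not FPC RTL routines.

def to_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    raw = s.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


def utf16_next(units, i):
    """Decode one code point at index i. Returns (code_point, next_index)."""
    u = units[i]
    if 0xD800 <= u <= 0xDBFF:  # the single "if": high surrogate found
        lo = units[i + 1]
        return 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00), i + 2
    return u, i + 1  # BMP character: exactly one code unit


def utf8_next(data, i):
    """Decode one code point at byte index i. Returns (code_point, next_index)."""
    b = data[i]
    if b < 0x80:  # 1-byte sequence (ASCII)
        return b, i + 1
    if b < 0xE0:  # 2-byte sequence
        return ((b & 0x1F) << 6) | (data[i + 1] & 0x3F), i + 2
    if b < 0xF0:  # 3-byte sequence
        return (((b & 0x0F) << 12) | ((data[i + 1] & 0x3F) << 6)
                | (data[i + 2] & 0x3F)), i + 3
    # 4-byte sequence
    return (((b & 0x07) << 18) | ((data[i + 1] & 0x3F) << 12)
            | ((data[i + 2] & 0x3F) << 6) | (data[i + 3] & 0x3F)), i + 4


def find_unit(units, target):
    """Index of the first code unit equal to target (a BMP character), or -1."""
    for i, u in enumerate(units):
        if u == target:
            return i
    return -1
```

For a known BMP character such as "ß" (U+00DF), find_unit returns an index that can be used directly on the UTF-16 string, with no decoding step at all. Whether the branch-count difference is measurable in practice depends on the data and the compiler, so treat this as an illustration of the argument and not as a benchmark.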