[fpc-devel] Unicode in the RTL (my ideas)

Hans-Peter Diettrich DrDiettrich1 at aol.com
Wed Aug 22 14:47:58 CEST 2012


Ivanko B wrote:
>> Do you mean replacing a character in a UCS-2/UCS-4 string can be
>> implemented more efficiently than in a UTF-8/UTF-16 string?
>>
> 
> Sure, just scan the string char by char, as array elements, and replace
> matches as they are encountered. Like working with integer arrays.

This applies only to UCS4/UTF-32. In all other cases the encoded byte
size of the two characters may differ, due to multi-byte sequences
(UTF-8) or surrogate pairs (UTF-16). Ligatures should also be
considered, so every simplified approach risks being buggy. At the
least, the encoded sizes of both "characters" should be compared, and
StringReplace should be used when they differ. But the same shortcut
applies to StringReplace as well, where substrings of the same size can
be replaced in-place :-)
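
To illustrate the size comparison, here is a minimal sketch in Python
(not FPC, and not the RTL's StringReplace). The helper names
encoded_size and replace_char_utf8 are hypothetical. It assumes UTF-8
input and a replacement of one code point by another; ligatures and
combining sequences are not handled. When both encoded sizes match,
the bytes are overwritten in place; otherwise it falls back to a
general replace that builds a new buffer.

```python
def encoded_size(ch: str, encoding: str) -> int:
    """Number of bytes one code point occupies in the given encoding."""
    return len(ch.encode(encoding))


def replace_char_utf8(buf: bytearray, old: str, new: str) -> bytearray:
    """Replace every occurrence of `old` by `new` in a UTF-8 buffer.

    Assumption: `buf` holds valid UTF-8 and `old`/`new` are single
    code points. Returns `buf` itself when the replacement could be
    done in place, otherwise a newly allocated bytearray.
    """
    old_b = old.encode("utf-8")
    new_b = new.encode("utf-8")

    if len(old_b) == len(new_b):
        # Same encoded size: overwrite the bytes where they are.
        # UTF-8 is self-synchronizing, so a byte-level match of a
        # complete encoded character cannot start mid-character.
        start = 0
        while True:
            i = buf.find(old_b, start)
            if i < 0:
                break
            buf[i:i + len(old_b)] = new_b
            start = i + len(new_b)
        return buf

    # Sizes differ: the buffer must grow or shrink, so use the
    # general replace, which returns a new bytearray.
    return buf.replace(old_b, new_b)


if __name__ == "__main__":
    # "a" and U+00E4 differ in size in UTF-8 but not in UTF-32.
    for enc in ("utf-8", "utf-16-le", "utf-32-le"):
        print(enc, encoded_size("a", enc), encoded_size("\u00e4", enc))
```

The "-le" codec names are used only so that Python does not prepend a
byte order mark, which would distort the sizes.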

DoDi



