[fpc-devel] Unicode in the RTL (my ideas)

Mattias Gaertner nc-gaertnma at netcologne.de
Wed Aug 22 08:59:59 CEST 2012


On Wed, 22 Aug 2012 09:34:33 +0500
Ivanko B <ivankob4mse2 at gmail.com> wrote:

> > Do you mean replacing a character in a UCS-2/UCS-4 string can be
> > implemented more efficiently than in a UTF-8/UTF-16 string?
> >
> 
> Sure, just scan the string char by char as array elements and replace
> matches as they are encountered. Like working with integer arrays.

Just some notes:
Often you need to replace ASCII characters like newlines, spaces or
semicolons. These can be replaced in UTF-8/UTF-16 just as easily,
because an ASCII code unit never occurs inside a multi-unit sequence.
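
A minimal Python sketch of that point, working directly on the UTF-8
bytes. The function name replace_ascii_byte and the sample text are
made up for illustration:

```python
def replace_ascii_byte(utf8_text: bytes, old: bytes, new: bytes) -> bytes:
    """Replace one ASCII byte with another in UTF-8 encoded data."""
    # Only single ASCII bytes are handled here. In UTF-8 every byte of a
    # multi-byte sequence is >= 0x80, so an ASCII byte cannot match inside one.
    assert len(old) == 1 and old[0] < 0x80
    assert len(new) == 1 and new[0] < 0x80
    out = bytearray(utf8_text)
    for i, b in enumerate(out):
        if b == old[0]:
            out[i] = new[0]  # same size, so the replacement is done in place
    return bytes(out)


data = "naïve; café; 日本語".encode("utf-8")
result = replace_ascii_byte(data, b";", b",")
print(result.decode("utf-8"))
```

The loop is the same "scan like an integer array" approach, only over
bytes instead of 16-bit or 32-bit elements.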

If you want to replace non-ASCII characters, for example to normalize
diacritical characters, then even in UCS-2/UCS-4 you have to replace
several codepoints with one.
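
As a small illustration, assuming Python's standard unicodedata module
and NFC as the normalization form (the message does not name a specific
form), a base letter plus a combining accent collapses into a single
codepoint, so the string length changes even with fixed-width 32-bit
elements:

```python
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT: two codepoints
decomposed = "e\u0301"
# NFC composes the pair into the single codepoint U+00E9
composed = unicodedata.normalize("NFC", decomposed)

# Size of each form when stored as 32-bit code units (UTF-32 / UCS-4)
utf32_before = len(decomposed.encode("utf-32-le"))
utf32_after = len(composed.encode("utf-32-le"))

print(len(decomposed), len(composed))
print(utf32_before, utf32_after)
```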

UCS-2 is not an option for the RTL, which must work with the full
Unicode range, and UCS-2 covers only the Basic Multilingual Plane.
And UCS-4 is a waste of space for big texts.
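
A rough sketch of both points in Python. The sample text is arbitrary,
and the exact ratio depends on the content: pure ASCII is the worst case
for UCS-4.

```python
# Mostly-ASCII text: 1 byte per character in UTF-8, 4 bytes in UTF-32/UCS-4
text = "Hello, world! " * 1000
utf8_size = len(text.encode("utf-8"))
utf32_size = len(text.encode("utf-32-le"))
print(utf8_size, utf32_size)

# U+1D11E MUSICAL SYMBOL G CLEF lies outside the Basic Multilingual Plane.
# UTF-16 needs a surrogate pair (two 16-bit units) for it; UCS-2, which has
# no surrogates, cannot represent it at all.
clef = "\U0001D11E"
utf16_units = len(clef.encode("utf-16-le")) // 2
print(utf16_units)
```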

How many functions have you written that replace characters in a
UTF-8/UTF-16 string with characters of a different size?
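
For the cases where that is needed, a hedged sketch in Python: since
UTF-8 is self-synchronizing, searching for the encoded bytes of one
character only matches at character boundaries, so an ordinary
byte-substring replace works. The helper name replace_char_utf8 is
hypothetical:

```python
def replace_char_utf8(utf8_text: bytes, old_char: str, new_char: str) -> bytes:
    """Replace one character with another of possibly different encoded size."""
    # Encode both characters and let the generic substring replace handle
    # the size difference; the result is rebuilt, not patched in place.
    return utf8_text.replace(old_char.encode("utf-8"), new_char.encode("utf-8"))


source = "café crème".encode("utf-8")
# 'é' and 'è' take 2 bytes each in UTF-8, 'e' takes 1 byte
stripped = replace_char_utf8(source, "é", "e")
stripped = replace_char_utf8(stripped, "è", "e")
print(stripped.decode("utf-8"), len(source), len(stripped))
```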

Mattias


