[fpc-devel] String and UnicodeString and UTF8String
Marco van de Voort
marcov at stack.nl
Wed Jan 12 07:45:47 CET 2011
In our previous episode, Jeff Wormsley said:
> > encoding of full Unicode. Ansi and UCS2 (really UTF-16) only *look*
> > easier to handle in user code, but both will fail and require special
> > code whenever characters outside the assumed codepage may occur.
>
> Preface: I don't write international apps, and probably won't for the
> foreseeable future...
>
> Isn't all of this concentration on trying to make strings have single
> byte characters (who cares how they are encoded), using the argument
> that it is somehow faster, incorrect for just about any modern
> processor, including embedded CPU's such as ARM?
> It was my understanding that 32 bit aligned access was always faster
> than byte aligned access on just about any CPU FPC still supports.
1-byte access is always 1-byte aligned, and memory bandwidth is still a
bigger bottleneck than these kinds of alignment issues. And you shuffle a
lot of extra zeroes around.
But the trouble is also that the 2-byte approach doesn't really solve anything
(you still have surrogates, and it will never be as simple as it was), and
there is a much bigger problem with legacy data (how much two-byte data do you
get daily, and how much one-byte?).
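
To make the surrogate point concrete, here is a small sketch. It is in Python
rather than Pascal purely for brevity, and the helper name utf16_units is made
up for this illustration. It counts the 16-bit code units that UTF-16 needs for
a character inside the Basic Multilingual Plane and for one outside it:

```python
import struct


def utf16_units(text: str) -> int:
    """Return how many 16-bit code units UTF-16 needs to encode `text`."""
    # "utf-16-le" emits no byte order mark, so every 2 bytes is one code unit.
    return len(text.encode("utf-16-le")) // 2


bmp_char = "\u00e9"          # e with acute accent, inside the BMP
astral_char = "\U0001D11E"   # musical symbol G clef, outside the BMP

print(utf16_units(bmp_char))     # 1
print(utf16_units(astral_char))  # 2

# The two code units of the astral character form a surrogate pair.
high, low = struct.unpack("<2H", astral_char.encode("utf-16-le"))
print(hex(high), hex(low))       # 0xd834 0xdd1e
```

So code that assumes "one 16-bit element = one character" breaks in the same
way as code that assumes "one byte = one character"; it just breaks less often.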
> The argument holds just fine for memory, but I don't really get the
> speed argument. Maybe I'm missing something.
Shoveling twice as much memory around IS the speed argument :-)
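
A rough illustration of the "twice as much memory" point, again as a Python
sketch with a made-up helper name (encoded_sizes) and a made-up sample string.
For mostly-ASCII text, the UTF-16 form is twice the size of the UTF-8 form, and
half of its bytes are zero:

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded size in bytes of `text` for a few encodings."""
    # The "-le" variants are used so that no byte order mark is counted.
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


sample = "Hello, fpc-devel!"   # pure ASCII sample text (an assumption)
sizes = encoded_sizes(sample)
zero_bytes = sample.encode("utf-16-le").count(0)

for name, size in sizes.items():
    print(name, size)
print("zero bytes in UTF-16 form:", zero_bytes)
```

This says nothing about text that is mostly non-Latin, where the ratio between
UTF-8 and UTF-16 is different; it only shows the case that dominates typical
legacy one-byte data.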