[fpc-devel] Is calling the Windows Unicode APIs really faster than the ANSI API's?

Mattias Gaertner nc-gaertnma at netcologne.de
Fri Sep 26 17:02:30 CEST 2008


On Fri, 26 Sep 2008 13:20:57 +0200
Michael Schnell <mschnell at lumino.de> wrote:

> Nonetheless a type to hold a single character needs to exist. And
> the same needs to be a 32 bit type if you want to store more than
> 2^16 different values (as is possible with UTF-8 and UTF-16, but
> not with UCS-2).

Some characters are encoded as several Unicode code points. For
example, a German a-umlaut is stored under Mac OS X HFS in decomposed
form as 2 code points = 1+2 bytes in UTF-8 and 2+2 bytes in UTF-16.
This is not some Egyptian or Klingon, but normal German, Finnish,
French, etc. An assignment like s[i]:='x' doesn't work in UTF-8, nor
in UTF-16, nor in UTF-32.
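Here is a minimal FPC sketch of that point (the program name, the
{$H+} AnsiString/WideString choice and the literals are my own
illustration): the decomposed a-umlaut occupies more than one string
element in either encoding, so s[i] addresses code units rather than
characters, and writing to one of them breaks the sequence.

program IndexDemo;
{$mode objfpc}{$H+}
// Sketch only: shows element counts, not a complete Unicode handling scheme.
var
  u8: AnsiString;   // holds UTF-8 encoded bytes
  u16: WideString;  // holds UTF-16 code units
begin
  // decomposed a-umlaut: 'a' (U+0061) + combining diaeresis (U+0308)
  u8  := 'a'#$CC#$88;   // UTF-8: 1 + 2 bytes
  u16 := 'a'#$0308;     // UTF-16: 2 code units
  WriteLn('UTF-8 length:  ', Length(u8));   // prints 3, not 1
  WriteLn('UTF-16 length: ', Length(u16));  // prints 2, not 1
  u8[2] := 'x';  // overwrites half of the 2-byte diaeresis -> invalid UTF-8
  WriteLn('after u8[2]:=''x'' the string is no longer valid UTF-8');
end.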

In short:
A single character type that works for all purposes cannot be defined.
Unicode cannot be handled as an array of characters.

The choice between UTF-8 and UTF-16 depends mostly on the libraries
used and on compatibility. The more Unicode features you want to
support, the less important the encoding becomes.

The encoding can, however, be important for speed:
for example, the widestring xml parser is up to 10 times slower than
the ansistring xml parser.
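As a rough sketch of how such a comparison could be set up (this is my
own toy scan loop, not the xml parser, and it does not claim to
reproduce the factor of 10; results depend on the RTL version,
conversions and platform):

program EncodingSpeed;
{$mode objfpc}{$H+}
uses
  SysUtils, DateUtils;
const
  Repeats = 2000;
var
  a: AnsiString;
  w: WideString;
  i, r, hits: Integer;
  t0: TDateTime;
begin
  a := StringOfChar('x', 100000) + '<';  // ~100 KB of dummy data
  w := WideString(a);                    // same data as UTF-16
  // scan the ansistring
  t0 := Now;
  hits := 0;
  for r := 1 to Repeats do
    for i := 1 to Length(a) do
      if a[i] = '<' then Inc(hits);
  WriteLn('ansistring scan: ', MilliSecondsBetween(Now, t0), ' ms (', hits, ' hits)');
  // scan the widestring
  t0 := Now;
  hits := 0;
  for r := 1 to Repeats do
    for i := 1 to Length(w) do
      if w[i] = '<' then Inc(hits);
  WriteLn('widestring scan: ', MilliSecondsBetween(Now, t0), ' ms (', hits, ' hits)');
end.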

Mattias


