[fpc-devel] Is calling the Windows Unicode APIs really faster than the ANSI API's?

Fri Sep 26 09:12:01 CEST 2008

On Fri, 26 Sep 2008, Graeme Geldenhuys wrote:

> On Thu, Sep 25, 2008 at 10:33 PM, Florian Klaempfl
> <florian at freepascal.org> wrote:
>>
>> Who says that? UTF-16 is simply chosen because it has features (supporting
>> all characters basically) ANSI doesn't?
>
> Sorry, my message was unclear and I got somewhat mixed up between ANSI
> and UTF-8. I meant the encoding type of String or UnicodeString being
> UTF-16 instead of UTF-8.  The CodeGear newsgroups are full of people
> saying that UTF-16 was chosen because they could call the 'W' APIs
> without needing a conversion.
>
> My question is, has anybody actually seen the speed difference (actual
> timing results) between a UTF-16 string calling the 'W' APIs and a
> UTF-8 string being converted to UTF-16 and then calling the 'W' APIs?
> With today's computers, I can't imagine that there would be a
> "significant speed loss" using such conversions. The speed difference
> might be milliseconds, but that's not really a "significant speed
> loss", is it?

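I think the main speed issue with UTF-8 is the speed of procedures like
"val". A "val" which accepts both Western and Arabic digits would be
significantly more complex, and therefore slower, in UTF-8 than in UTF-16:
in UTF-16 every such digit is a single code unit, whereas in UTF-8 the
Arabic digits (U+0660..U+0669) take two bytes each, so the routine has to
handle variable-length sequences.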
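
To illustrate the point about "val", here is a minimal sketch in Python (not
Free Pascal, and not how the RTL implements it). The function names
val_utf16, val_utf8 and utf16_units are hypothetical. It assumes only
Western digits and the Arabic-Indic digits U+0660..U+0669 are accepted, and
that the input is a plain non-negative integer with no sign or whitespace.

```python
def val_utf16(units):
    """Parse digits from a sequence of UTF-16 code units (ints).

    Every accepted digit is exactly one code unit, so the loop
    advances by one element per digit.
    """
    result = 0
    for u in units:
        if 0x30 <= u <= 0x39:            # Western digits '0'..'9'
            digit = u - 0x30
        elif 0x0660 <= u <= 0x0669:      # Arabic-Indic digits
            digit = u - 0x0660
        else:
            raise ValueError("not a digit")
        result = result * 10 + digit
    return result


def val_utf8(data):
    """Parse digits from UTF-8 bytes.

    Western digits are one byte; Arabic-Indic digits are the two-byte
    sequences D9 A0 .. D9 A9, so the loop needs a lead-byte check, a
    bounds check, and a variable step.
    """
    result = 0
    i = 0
    while i < len(data):
        b = data[i]
        if 0x30 <= b <= 0x39:
            digit = b - 0x30
            i += 1
        elif b == 0xD9 and i + 1 < len(data) and 0xA0 <= data[i + 1] <= 0xA9:
            digit = data[i + 1] - 0xA0
            i += 2
        else:
            raise ValueError("not a digit")
        result = result * 10 + digit
    return result


def utf16_units(text):
    """Helper: turn a Python str into a list of UTF-16 code units."""
    raw = text.encode("utf-16-le")
    return [raw[i] | (raw[i + 1] << 8) for i in range(0, len(raw), 2)]
```

Both functions give the same result for a mixed string such as
"12" followed by U+0663 U+0664; the UTF-8 version simply has more branches
per character. Whether that difference is measurable in practice would
still need timing.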

> I suppose it would be viable doing timing results for saving text
> files as well. After all, 99% of the time, text files are stored in
> UTF-8. So in D2009 you would first have to convert UTF-16 to UTF-8 and
> then save. And the opposite when reading, plus checking for the byte
> order mark.  If you used UTF-8 for the String encoding, no
> conversions are required and no byte order mark checks are needed.

For me the speed of input/output is less relevant; it is limited by disk
speed anyway. It's the speed of processing that should be decisive.
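
For anyone who wants numbers rather than guesses, the byte order mark check
and the UTF-8 to UTF-16 conversion discussed above can be sketched as
follows. This is a rough Python sketch, not Delphi or Free Pascal code; the
names decode_text and time_utf8_to_utf16 are hypothetical, files without a
byte order mark are assumed to be UTF-8, and UTF-32 is ignored. Python
converts through its own internal string representation, so the timing is
only a proxy for what a native conversion would cost.

```python
import codecs
import time


def decode_text(raw):
    """Decode file contents, checking for a byte order mark first.

    Assumption: data without a byte order mark is UTF-8.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    return raw.decode("utf-8")


def time_utf8_to_utf16(text, repeats=100):
    """Return the average time in seconds for one UTF-8 -> UTF-16 conversion."""
    utf8 = text.encode("utf-8")
    start = time.perf_counter()
    for _ in range(repeats):
        utf8.decode("utf-8").encode("utf-16-le")
    return (time.perf_counter() - start) / repeats
```

Calling time_utf8_to_utf16 on a string of representative size and comparing
the result with the time needed to read the same amount of data from disk
would show which of the two dominates on a given machine.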

Daniël