[fpc-devel] Performance of string handling in trunk
Jonas Maebe
jonas.maebe at elis.ugent.be
Thu Jun 27 15:33:12 CEST 2013
On 27 Jun 2013, at 03:54, luiz americo pereira camara wrote:
> 2013/6/21 Sergei Gorelkin <sergei_gorelkin at mail.ru>:
>
>> I've profiled the code and found no conversions taking place. All the
>> slowdown appears to be caused by other reasons, hard to tell the
>> topmost
>> contributor. What catches the eye is the large amount of calls to
>> UniqueString, and the fact that SetCodePage goes through implicit
>> try..finally block even if it does not need to convert the string.
>
> Seems that Florian changed SetCodePage to avoid implicit try finally.
>
> It improved the performance slightly but still a lot slower than
> 2.6.X .
The speed is virtually identical for me between FPC 2.6.x and trunk on
Mac OS X/PPC. Of course, it's an incomplete test program and there is
no information about how it is compiled. Additionally, the timings in
the last post in that thread are completely different from the ones in
the first post, so they seem to come from a different test program or
a different system (which should be specified; comparing arbitrary
numbers is useless if you are interested in finding out what the cause
is).
There was a small difference if the program was either compiled with
-Fcxxx or contained a {$codepage xxx} directive. That results in the
constant strings in the program getting that particular code page
rather than CP_ACP in 2.7.1. In this case the test program became 10%
slower. The reason was that new ansistrings created in the rtl (e.g.
for char to ansistring, or concatenating ansistrings) get CP_ACP by
default, and changing this afterwards to the custom code page caused
going through InternalSetCodePage() with its exception frame
setup/teardown. I solved that in r24985.
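
A minimal sketch for inspecting the code pages described above, not the
test program from the thread. It assumes a compiler with code-page-aware
ansistrings (2.7.1 or later) and picks utf8 as the custom code page purely
for illustration, since the thread only says "xxx". The variable names are
hypothetical. It prints the code page of a string taken from a constant and
of one created from a char, then retags the latter without conversion.

```pascal
program codepagedemo;
{$mode objfpc}{$H+}
{$codepage utf8}  // assumption: stands in for the unspecified "xxx"

var
  fromConst, fromChar: RawByteString;
  c: AnsiChar;
begin
  // A string constant: its code page follows the {$codepage} directive.
  fromConst := 'constant';

  // A string created by the rtl from a single char.
  c := 'x';
  fromChar := c;

  WriteLn('constant:          ', StringCodePage(fromConst));
  WriteLn('from char:         ', StringCodePage(fromChar));

  // Convert=False only changes the code page tag; the bytes stay as they are.
  SetCodePage(fromChar, CP_UTF8, False);
  WriteLn('after SetCodePage: ', StringCodePage(fromChar));
end.
```

Comparing the printed values with and without the {$codepage} directive
shows whether the rtl-created string starts out with a different code page
than the constants, which is the case where the extra code page change is
needed.
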
Under Linux/i386, however, I do still see a significant slowdown
(regardless of the code page used). Strangely enough, it goes away for
me if the system unit is compiled with -O2 -Oonostackframe instead of
with -O2. I don't know why. It might be some ugly cache conflict, but
for such a small program and little data that is unlikely to be the
case on modern x86 caches. It might also be code alignment, but some
playing with -Oaloop and -Oaproc doesn't seem to change it either.
Jonas