[fpc-devel] CompareMem slower in FPC 2.4.4
Michalis Kamburelis
michalis.kambi at gmail.com
Wed Jun 1 22:07:18 CEST 2011
Hi,
In my tests, FPC 2.4.4 has much slower CompareMem than FPC 2.4.2, at
least for some cases:
# with fpc 2.4.4
$ time ./compare_mem_test 100000000
real 0m7.795s
user 0m7.764s
sys 0m0.008s
# with fpc 2.4.2
$ time ./compare_mem_test 100000000
real 0m1.218s
user 0m1.216s
sys 0m0.000s
This uses CompareMem to compare vectors of 3 x Single (12 bytes). As you
can see, FPC 2.4.2 is > 6 times faster, so it's a noticeable slowdown
(especially if this is a bottleneck of your algorithm :). This is on
Linux, i386 (32 bit).
I'm attaching the sample compare_mem_test.lpr; compile it with "fpc
compare_mem_test.lpr".
I did some tests. In FPC 2.4.4 CompareMem is equivalent to CompareByte. I tried
using CompareWord, which is ~2 times faster, and CompareDWord, which is
even better, but still --- even the best (CompareDWord) on FPC 2.4.4 is
~3 times slower than CompareMem on FPC 2.4.2.
Funny thing is, the fastest approach turned out to be the simplest one:
compare by "(V1[0] = V2[0]) and (V1[1] = V2[1]) and (V1[2] = V2[2])".
This is *much* faster than every other approach. (And yes, I tested that
it wasn't optimized out :)
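For reference, the variants compared above could look roughly like the
sketch below. This is not the attached compare_mem_test.lpr: the type
TVector3Single and the EqualByXxx function names are assumptions made up
for illustration. Only CompareMem (unit SysUtils) and CompareDWord (unit
System) are actual RTL routines.

```pascal
program compare_variants;
{$mode objfpc}

uses
  SysUtils;  // CompareMem lives in SysUtils; CompareDWord is in System

type
  // Assumed vector type: 3 x Single = 12 bytes
  TVector3Single = array [0..2] of Single;

// Bitwise comparison through CompareMem (length given in bytes)
function EqualByMem(const V1, V2: TVector3Single): Boolean;
begin
  Result := CompareMem(@V1, @V2, SizeOf(TVector3Single));
end;

// Bitwise comparison through CompareDWord; the length is a count of
// DWords, not bytes, and the result is 0 when the buffers are equal
function EqualByDWord(const V1, V2: TVector3Single): Boolean;
begin
  Result := CompareDWord(V1, V2,
    SizeOf(TVector3Single) div SizeOf(DWord)) = 0;
end;

// Plain component-wise floating-point comparison
function EqualByComponents(const V1, V2: TVector3Single): Boolean;
begin
  Result := (V1[0] = V2[0]) and (V1[1] = V2[1]) and (V1[2] = V2[2]);
end;

var
  A, B: TVector3Single;
begin
  A[0] := 1.0; A[1] := 2.0; A[2] := 3.0;
  B := A;
  // Identical vectors: all three variants should agree
  Writeln(EqualByMem(A, B), ' ', EqualByDWord(A, B), ' ',
    EqualByComponents(A, B));
  B[2] := 4.0;
  // Differing last component: all three variants should agree again
  Writeln(EqualByMem(A, B), ' ', EqualByDWord(A, B), ' ',
    EqualByComponents(A, B));
end.
```

Note that the variants are not strictly interchangeable: the CompareXxx
routines compare bit patterns, while "=" compares floating-point values,
so they can disagree for NaN and for +0.0 versus -0.0.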
Any thoughts? Maybe something can be improved?
1. Why did CompareMem get slower in FPC 2.4.4? Maybe something can be fixed?
2. The simple comparison "(V1[0] = V2[0]) and..." is much faster than
any CompareXxx. Any chance of improving it? In this case, size is known
at compile time, so maybe CompareXxx could be "magic" and (for
reasonably small sizes) the compiler could just generate proper code
to compare them, just like the "=" operator? Just an idea of course, I don't
know how easy it would be to actually implement.
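To illustrate the kind of code I have in mind for a size known at
compile time, here is a hand-written approximation. It is only a sketch:
SameBits, TThreeDWords and PThreeDWords are hypothetical names, not RTL
routines or types, and nothing here reflects what the compiler currently
generates. Whether the call is really inlined depends on the compiler
settings.

```pascal
program inline_compare_sketch;
{$mode objfpc}
{$inline on}

type
  TVector3Single = array [0..2] of Single;
  // Same 12-byte layout, used only to read the raw bit patterns
  TThreeDWords = array [0..2] of DWord;
  PThreeDWords = ^TThreeDWords;

// Hypothetical helper: unrolled bitwise comparison for a fixed 12-byte
// size, with no loop and no call into a generic CompareXxx routine
function SameBits(const V1, V2: TVector3Single): Boolean; inline;
var
  P1, P2: PThreeDWords;
begin
  P1 := PThreeDWords(@V1);
  P2 := PThreeDWords(@V2);
  Result := (P1^[0] = P2^[0]) and (P1^[1] = P2^[1]) and (P1^[2] = P2^[2]);
end;

var
  A, B: TVector3Single;
begin
  A[0] := 1.0; A[1] := 2.0; A[2] := 3.0;
  B := A;
  Writeln(SameBits(A, B));  // identical bit patterns
  B[1] := 5.0;
  Writeln(SameBits(A, B));  // middle component differs
end.
```

Comparing DWords keeps the bitwise semantics of CompareMem, which is why
the sketch does not simply use "=" on the Single values.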
Michalis