[fpc-devel] CompareMem slower in FPC 2.4.4

Michalis Kamburelis michalis.kambi at gmail.com
Wed Jun 1 22:07:18 CEST 2011


Hi,

In my tests, FPC 2.4.4 has much slower CompareMem than FPC 2.4.2, at
least for some cases:

# with fpc 2.4.4
$ time ./compare_mem_test 100000000
real	0m7.795s
user	0m7.764s
sys	0m0.008s
# with fpc 2.4.2
$ time ./compare_mem_test 100000000
real	0m1.218s
user	0m1.216s
sys	0m0.000s

This uses CompareMem to compare vectors of 3 x Single (12 bytes). As you
can see, FPC 2.4.2 is more than 6 times faster, so it's a noticeable
slowdown (especially if this is a bottleneck of your algorithm :). This
is on Linux, i386 (32 bit).

I'm attaching the sample compare_mem_test.lpr, just compile it with "fpc
compare_mem_test.lpr".

I did some tests. In FPC 2.4.4 CompareMem is equivalent to CompareByte.
I tried using CompareWord, which is ~2 times faster, and CompareDWord,
which is even better, but still, even the best (CompareDWord) on FPC
2.4.4 is ~3 times slower than CompareMem on FPC 2.4.2.

Funny thing is, the fastest approach turned out to be the simplest one:
compare by "(V1[0] = V2[0]) and (V1[1] = V2[1]) and (V1[2] = V2[2])".
This is *much* faster than every other approach. (And yes, I tested that
it wasn't optimized out :)
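
To make the variants concrete, here is a minimal sketch of the three
kinds of comparison discussed above. The type name TVector3Single and
the function names are illustrative assumptions, not taken from the
attached compare_mem_test.lpr, and this is not a benchmark, only a
demonstration of the calls involved.

```pascal
program compare_sketch;
{$mode objfpc}

uses
  SysUtils; { CompareMem is in SysUtils; CompareDWord is in System }

type
  { Assumed vector type: 3 x Single = 12 bytes }
  TVector3Single = array [0..2] of Single;

{ Generic memory comparison, size is passed at run time (in bytes). }
function EqualByCompareMem(const V1, V2: TVector3Single): Boolean;
begin
  Result := CompareMem(@V1, @V2, SizeOf(TVector3Single));
end;

{ DWord-wise comparison. Note the length is a count of DWords,
  not bytes: 12 bytes = 3 DWords. Returns 0 when the buffers match. }
function EqualByCompareDWord(const V1, V2: TVector3Single): Boolean;
begin
  Result := CompareDWord(V1, V2,
    SizeOf(TVector3Single) div SizeOf(DWord)) = 0;
end;

{ Plain element-wise comparison with the "=" operator. }
function EqualByElements(const V1, V2: TVector3Single): Boolean;
begin
  Result := (V1[0] = V2[0]) and (V1[1] = V2[1]) and (V1[2] = V2[2]);
end;

var
  A, B: TVector3Single;
begin
  A[0] := 1; A[1] := 2; A[2] := 3;
  B := A;
  { All three should agree on equal vectors... }
  Writeln(EqualByCompareMem(A, B), ' ',
          EqualByCompareDWord(A, B), ' ',
          EqualByElements(A, B));
  { ...and on vectors that differ in one component. }
  B[2] := 4;
  Writeln(EqualByCompareMem(A, B), ' ',
          EqualByCompareDWord(A, B), ' ',
          EqualByElements(A, B));
end.
```

One caveat when swapping between them: the memory-based variants
compare bit patterns, while "=" compares floating-point values, so they
can disagree for +0.0 versus -0.0 and for NaN.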

Any thoughts? Maybe something can be improved?

1. Why did CompareMem get slower in FPC 2.4.4? Maybe something can be fixed?

2. The simple comparison "(V1[0] = V2[0]) and..." is much faster than
any CompareXxx. Any chance of improving it? In this case, size is known
at compile time, so maybe CompareXxx could be "magic" and (for
reasonably small sizes) the compiler could just generate the proper code
to compare them, just like the "=" operator? Just an idea of course, I
don't know how easy it would be to actually implement.

Michalis


