[fpc-devel] Curious about the effect of all the new optimizations....
lazarus at mfriebe.de
Wed Mar 1 14:11:28 CET 2023
On 01/03/2023 12:25, J. Gareth Moreton via fpc-devel wrote:
> My peephole optimisations mostly save only a handful of cycles each
> time which probably won't add up to much for a relatively short test.
> The most major optimisation I can think of, although I'm not quite
> sure when it was merged, is the method of replacing divisions by a
> constant with an equivalent reciprocal multiplication. You'll see the
> biggest savings there. There are other difficulties, like processors
> being intelligent with caching and out-of-order execution, for
> example, that disguise some inefficiencies. And some optimisations
> seek only to reduce code size with no loss of speed.
> What are your timings like when compiling with COREAVX or COREAVX2? A
> couple of recent peephole optimizations make use of BMI1 and BMI2.
I had -CpCOREAVX2 supplied. (My fpc is a good week old, so if "recent"
means less than that...)
I don't have many divisions in that code.
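For reference, the division-by-constant replacement mentioned in the quoted text can be sketched roughly as below. This is only an illustration in C, not the code FPC actually emits, and the function names are made up for the example. It divides an unsigned 32-bit value by 10 by multiplying with ceil(2^35 / 10) = 0xCCCCCCCD and shifting the 64-bit product right by 35 bits, which matches the plain division for every 32-bit input.

```c
#include <stdint.h>

/* Plain division by a constant; a compiler may or may not emit a DIV here. */
uint32_t div10_plain(uint32_t x)
{
    return x / 10u;
}

/* Same result via reciprocal multiplication (hypothetical helper name).
 * 0xCCCCCCCD is ceil(2^35 / 10); the product fits in 64 bits because
 * both factors are below 2^32. */
uint32_t div10_reciprocal(uint32_t x)
{
    uint64_t product = (uint64_t)x * 0xCCCCCCCDu;
    return (uint32_t)(product >> 35);
}
```

A multiply plus shift typically has much lower latency than a hardware divide, which is why the saving only shows up in code that actually divides by constants in a hot path.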
Most of the code is going through big data in memory. So it's possible
that any gained processing speed just has to wait for data to be fetched.
> I can't remember the proverb that Florian used, but it essentially
> boils down to very small changes, individually not amounting to much,
> but which accumulate into a noticeable difference when in large numbers.
Hence testing back to 3.2.3 (unfortunately 3.2.2 has a bug that
matters in this code).
Also, I didn't expect any huge differences; I just wanted to see if
anything could be noticed at all (and, if lucky, in the test I ran).
I did a test on a more limited scope (testing only a handful of
functions). That test runs 4 min 20 sec under 3.2.3,
and 2 seconds longer with 3.3.1. But then I only had 2 sample runs for
each fpc version....