[fpc-devel] Curious about the effect of all the new optimizations....

Martin Frb lazarus at mfriebe.de
Wed Mar 1 14:11:28 CET 2023

On 01/03/2023 12:25, J. Gareth Moreton via fpc-devel wrote:
> My peephole optimisations mostly save only a handful of cycles each 
> time which probably won't add up to much for a relatively short test.  
> The most major optimisation I can think of, although I'm not quite 
> sure when it was merged, is the method of replacing divisions by a 
> constant with an equivalent reciprocal multiplication.  You'll see the 
> biggest savings there.  There's other difficulties like processors 
> being intelligent with caching and out of order execution, for 
> example, that are disguising some inefficiencies.  And some seek only 
> to reduce code size with no loss of speed.
> What are your timings like when compiling with COREAVX or COREAVX2?  A 
> couple of recent peephole optimizations make use of BMI1 and BMI2.
I had -CpCOREAVX2 supplied. (My fpc is a good week old, so if "recent"
means newer than that, I don't have those changes.)
I don't have many divisions in that code.
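
For anyone not familiar with the reciprocal multiplication mentioned in
the quote, here is a minimal sketch of the general idea in Python. It is
not FPC's actual implementation: the function names (magic_for,
div_by_const) and the 32-bit unsigned width are assumptions made for
illustration. A real code generator also has to fit the multiplier into
machine registers, which this sketch sidesteps by using Python's
arbitrary-precision integers.

```python
def magic_for(d, bits=32):
    """Return (m, shift) such that (x * m) >> shift == x // d
    for every unsigned x with 0 <= x < 2**bits.

    Hypothetical helper: round-up reciprocal, m = ceil(2**shift / d),
    with shift = bits + ceil(log2(d)).
    """
    if d < 1:
        raise ValueError("divisor must be a positive integer")
    log2_ceil = (d - 1).bit_length()      # ceil(log2(d)), 0 for d == 1
    shift = bits + log2_ceil
    m = -(-(1 << shift) // d)             # ceiling division
    return m, shift


def div_by_const(x, m, shift):
    """Divide x by the constant that (m, shift) was built for,
    using one multiplication and one shift instead of a division."""
    return (x * m) >> shift


# Usage: precompute the constants once, then reuse them per value.
m7, s7 = magic_for(7)
samples = [0, 1, 6, 7, 8, 1000, 2**31, 2**32 - 1]
results = [div_by_const(x, m7, s7) for x in samples]
```

The saving comes from the fact that an integer multiply plus shift is
usually much cheaper than a hardware divide, so code with few divisions
by constants (like mine) would not be expected to benefit much.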

Most of the code is going through big data in memory. So it's possible
that any gained processing speed just ends up waiting for data to be
fetched.

> I can't remember the proverb that Florian used, but it essentially 
> boils down to very small changes, individually not amounting to much, 
> but which accumulate into a noticeable difference when in large numbers.
Hence testing back to 3.2.3 (unfortunately 3.2.2 has a bug that
matters in this code).

Also, I didn't expect any huge differences; I just wanted to see whether
anything could be noticed at all (and, if lucky, in the test I ran).

I did a test on a more limited scope (testing only a handful of
functions). That test runs 4 min 20 sec under 3.2.3, and takes 2 extra
seconds with 3.3.1. But then I only had 2 sample runs for each fpc
version....
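
To put those numbers in proportion, a small sketch of the arithmetic in
Python. The function name relative_slowdown is made up for illustration,
and the inputs are just the two figures above (4 min 20 sec, plus 2
seconds). With only two runs per version this says nothing about
run-to-run variance.

```python
def relative_slowdown(base_seconds, extra_seconds):
    """Extra time as a percentage of the baseline run time."""
    return 100.0 * extra_seconds / base_seconds


base = 4 * 60 + 20                # 4 min 20 sec under 3.2.3 = 260 s
pct = relative_slowdown(base, 2)  # 2 extra seconds with 3.3.1
print(f"{pct:.2f}% slower")       # prints "0.77% slower"
```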
