[fpc-devel] Peephole optimisation progress

J. Gareth Moreton gareth at moreton-family.com
Sun Jan 19 02:11:12 CET 2020


Hi everyone,

So I'm still focused on x86 for the moment, and I'm still looking for 
ways to both increase the speed of the compiler and also find new 
optimisations. Currently I'm building a few more principles of my "Deep 
Optimizer" into OptPass1MOV that are showing promise - I should have a 
patch ready tomorrow, and if approved, I can build a few extra things on 
top of it, as well as removing some other optimisations that have become 
redundant as a result.

With regard to things that are ready, I discovered an extra pass that 
occurs after the Post-Peephole Optimization stage that is a little 
hidden (PostPeepHoleOpts is overridden and calls OptReferences after the 
'inherited' call).  What this pass does is optimise all of the 
references in the instructions to take on a standardised form.  Because 
of the nature of the Post-Peephole Optimization stage (generally only 
converting individual instructions into more compact forms), it is very 
easy to optimise the current instruction's references as part of this 
pass, thereby removing the need to have a separate OptReferences pass.  
Details can be found here: https://bugs.freepascal.org/view.php?id=36583 
- initial experiments show a 10% speed increase.
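
As a rough illustration of why the two passes can be folded together, 
here is a minimal sketch in Python (not compiler code).  Everything in 
it is an assumption made for the example: the Ref and Instr types, the 
normalise_ref rule (moving an unscaled index with no base into the base 
slot) and the compact rule (mov $0 to xor, ignoring the flags) are 
hypothetical stand-ins, not the actual OptReferences or PostPeepHoleOpts 
logic.  The point is only that when each step looks at one instruction 
at a time, a single walk gives the same result as two.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Ref:
    """Hypothetical memory reference: offset(base, index, scale)."""
    base: Optional[str]
    index: Optional[str]
    scale: int = 1
    offset: int = 0


@dataclass
class Instr:
    """Hypothetical instruction: opcode plus operands (AT&T order)."""
    opcode: str
    operands: List[Union[str, Ref]]


def normalise_ref(ref: Ref) -> None:
    # Illustrative rule only: an unscaled index with no base is moved
    # into the base slot so that equivalent references look the same.
    if ref.base is None and ref.index is not None and ref.scale <= 1:
        ref.base, ref.index, ref.scale = ref.index, None, 1


def compact(instr: Instr) -> None:
    # Illustrative rule only: "mov $0, %reg" becomes "xor %reg, %reg".
    # (A real pass would also have to check that the flags are unused.)
    ops = instr.operands
    if (instr.opcode == "mov" and len(ops) == 2 and ops[0] == "$0"
            and isinstance(ops[1], str) and ops[1].startswith("%")):
        instr.opcode, instr.operands = "xor", [ops[1], ops[1]]


def two_passes(code: List[Instr]) -> List[Instr]:
    for instr in code:          # first walk: compact instructions
        compact(instr)
    for instr in code:          # second walk: normalise references
        for op in instr.operands:
            if isinstance(op, Ref):
                normalise_ref(op)
    return code


def fused_pass(code: List[Instr]) -> List[Instr]:
    for instr in code:          # single walk doing both jobs
        compact(instr)
        for op in instr.operands:
            if isinstance(op, Ref):
                normalise_ref(op)
    return code


def sample() -> List[Instr]:
    return [
        Instr("mov", ["$0", "%eax"]),
        Instr("mov", [Ref(base=None, index="%ebx", offset=8), "%ecx"]),
    ]
```

Calling two_passes(sample()) and fused_pass(sample()) gives equal 
lists; the saving in the fused version is the second walk over the 
instruction list.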

One other thing that has been on my mind for a while, but would need 
some discussion... I would like to move Pass 1 so it runs before all of 
the imaginary registers are changed to real registers.  Some 
optimisations are able to reduce the number of registers required by a 
routine, but since this occurs after the register allocation, overhead 
such as preserving and restoring non-volatile registers has already been 
generated.  In addition, further optimisations can be programmed to 
help the allocator.  For example, suppose you come across "mov %immreg1, 
%immreg2" where %immreg1 gets deallocated at this point (i.e. is not 
used afterwards) and it's the first appearance of %immreg2.  All 
references to %immreg2 after that point can then be changed to %immreg1 
and the mov instruction removed.  If that's too much work, but the 
allocator can be trusted to give %immreg1 and %immreg2 the same 
"colour" in such instances, then the extraneous mov instruction (by 
then a move from a register to itself) could instead be removed in 
Pass 2, say.  The only difficulty is working out how to track the 
register usage, since TAllUsedRegs can only handle real registers.  I 
figure a new descendant class might be the answer to that one.  
Something new to research!
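
To make the "mov %immreg1, %immreg2" idea concrete, here is a minimal 
sketch in Python (not compiler code).  It makes several assumptions: 
the code is a single straight-line block with no jumps, instructions 
are plain tuples in AT&T order (source first), operands are bare 
register names or immediates rather than memory references, and 
imaginary registers are recognised by a hypothetical "%immreg" prefix.  
None of these names come from the compiler.

```python
from typing import List, Tuple

# (opcode, operand, operand, ...) in AT&T order: source first.
Instr = Tuple[str, ...]


def is_imaginary(operand: str) -> bool:
    # Hypothetical naming convention for not-yet-allocated registers.
    return operand.startswith("%immreg")


def uses(instr: Instr, reg: str) -> bool:
    # True if reg appears as any operand of instr.
    return reg in instr[1:]


def rename(instr: Instr, old: str, new: str) -> Instr:
    # Replace every operand equal to old with new; keep the opcode.
    return (instr[0],) + tuple(new if x == old else x for x in instr[1:])


def coalesce_moves(code: List[Instr]) -> List[Instr]:
    """Remove "mov %immregA, %immregB" when A is dead afterwards and
    this is B's first appearance, renaming later uses of B to A."""
    code = list(code)
    i = 0
    while i < len(code):
        instr = code[i]
        if (len(instr) == 3 and instr[0] == "mov"
                and is_imaginary(instr[1]) and is_imaginary(instr[2])
                and instr[1] != instr[2]):
            src, dst = instr[1], instr[2]
            # Source is "deallocated here": no later instruction uses it.
            src_dead = not any(uses(x, src) for x in code[i + 1:])
            # Destination has not been seen before this mov.
            dst_new = not any(uses(x, dst) for x in code[:i])
            if src_dead and dst_new:
                # Drop the mov and rewrite everything after it.
                code[i:] = [rename(x, dst, src) for x in code[i + 1:]]
                continue  # re-examine whatever now sits at index i
        i += 1
    return code


example = [
    ("mov", "$5", "%immreg1"),
    ("add", "$1", "%immreg1"),
    ("mov", "%immreg1", "%immreg2"),
    ("add", "%immreg2", "%immreg3"),
    ("mov", "%immreg2", "%eax"),
]
result = coalesce_moves(example)
```

In this example the third instruction is removed and the two later 
uses of %immreg2 become %immreg1, so the routine needs one fewer 
imaginary register before allocation starts.  A real implementation 
would need proper liveness information across jumps, which is where the 
register-tracking question above comes in.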

Gareth aka. Kit
