[fpc-devel] Peephole optimisation progress
J. Gareth Moreton
gareth at moreton-family.com
Sun Jan 19 02:11:12 CET 2020
Hi everyone,
So I'm still focused on x86 for the moment, and I'm still looking for
ways to both increase the speed of the compiler and find new
optimisations. Currently I'm building a few more principles of my "Deep
Optimizer" into OptPass1MOV that are showing promise - I should have a
patch ready tomorrow, and if approved, I can build a few extra things on
top of it, as well as removing some other optimisations that have become
redundant as a result.
As for things that are ready, I discovered an extra pass that occurs
after the Post-Peephole Optimization stage and is a little hidden
(PostPeepHoleOpts is overridden and calls OptReferences after the
'inherited' call). What this pass does is optimise all of the
references in the instructions to take on a standardised form. Because
of the nature of the Post-Peephole Optimization stage (generally only
converting individual instructions into more compact forms), it is very
easy to optimise the current instruction's references as part of this
pass, thereby removing the need to have a separate OptReferences pass.
Details can be found here: https://bugs.freepascal.org/view.php?id=36583
- initial experiments show a 10% speed increase.
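To make the idea of folding the reference pass into the post-peephole
loop concrete, here is a minimal sketch. It is not FPC code: Python is
used only so the sketch is runnable, and Ref, Instr, normalise_ref and
compact are hypothetical stand-ins. Both rewrite rules are illustrative
assumptions, not a description of what OptReferences or
PostPeepHoleOpts actually do.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Ref:
    """Hypothetical memory reference: offset(base, index, scale)."""
    base: Optional[str] = None
    index: Optional[str] = None
    scale: int = 1
    offset: int = 0


Operand = Union[str, Ref]


@dataclass(frozen=True)
class Instr:
    op: str
    args: Tuple[Operand, ...]


def normalise_ref(ref: Ref) -> Ref:
    # Assumed canonical form: an index with scale 1 and no base
    # is rewritten as a plain base register.
    if ref.base is None and ref.index is not None and ref.scale == 1:
        return Ref(base=ref.index, offset=ref.offset)
    return ref


def normalise_refs(instr: Instr) -> Instr:
    # Normalise every reference operand of a single instruction.
    args = tuple(normalise_ref(a) if isinstance(a, Ref) else a
                 for a in instr.args)
    return Instr(instr.op, args)


def compact(instr: Instr) -> Instr:
    # Illustrative single-instruction rewrite: mov $0,reg -> xor reg,reg.
    # A real compiler must also check that the flags are not in use.
    if (instr.op == "mov" and len(instr.args) == 2
            and instr.args[0] == "$0" and isinstance(instr.args[1], str)):
        reg = instr.args[1]
        return Instr("xor", (reg, reg))
    return instr


def two_pass(code: List[Instr]) -> List[Instr]:
    # Separate walks: compact everything, then normalise references.
    code = [compact(i) for i in code]
    return [normalise_refs(i) for i in code]


def fused(code: List[Instr]) -> List[Instr]:
    # One walk: each instruction is compacted and normalised in turn.
    return [normalise_refs(compact(i)) for i in code]
```

Because each rewrite here looks at one instruction only, the fused
version produces the same result as the two separate walks while
visiting the instruction list once.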
One other thing that has been on my mind for a while, but would need
some discussion... I would like to move Pass 1 so it runs before all of
the imaginary registers are changed to real registers. Some
optimisations are able to reduce the number of registers required by a
routine, but since this occurs after the register allocation, overhead
such as preserving and restoring non-volatile registers has already been
generated. Further optimisations can also be programmed to help the
allocator. For example, if you come across "mov %immreg1, %immreg2",
where %immreg1 gets deallocated at this point (i.e. is not used
afterwards) and it's the first appearance of %immreg2, then all
references to %immreg2 after that point can be changed to %immreg1 and
the mov instruction removed. If that's too much work, but the allocator
can be trusted to give %immreg1 and %immreg2 the same "colour" in such
instances, then the extraneous mov instruction would instead have to be
removed in Pass 2, say. The only difficulty is working out how to track the
register usage, since TAllUsedRegs can only handle real registers. I
figure a new descendant class might be the answer to that one.
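The mov example above can be sketched as follows. This is a minimal
illustration under stated assumptions, not FPC code: Python is used only
so the sketch is runnable, instructions are plain tuples of
(op, source, destination), the code is a single straight-line block with
no jumps, and registers never appear inside memory references. The names
is_virtual and coalesce_moves are hypothetical, and a simple forward
scan stands in for the deallocation information the compiler would
really use.

```python
from typing import List, Tuple

Instr = Tuple[str, ...]  # (op, operand, operand, ...)


def is_virtual(operand: str) -> bool:
    # Assumed naming convention for imaginary registers.
    return operand.startswith("%immreg")


def coalesce_moves(code: List[Instr]) -> List[Instr]:
    code = list(code)
    i = 0
    while i < len(code):
        op, *args = code[i]
        if (op == "mov" and len(args) == 2
                and is_virtual(args[0]) and is_virtual(args[1])
                and args[0] != args[1]):
            src, dst = args
            # dst must make its first appearance in this mov.
            seen_before = any(dst in ins[1:] for ins in code[:i])
            # src must be dead after this mov (approximated by a scan).
            used_after = any(src in ins[1:] for ins in code[i + 1:])
            if not seen_before and not used_after:
                # Rename dst to src in everything that follows...
                code[i + 1:] = [
                    (ins[0],) + tuple(src if a == dst else a
                                      for a in ins[1:])
                    for ins in code[i + 1:]
                ]
                # ...and drop the now-redundant mov.
                del code[i]
                continue
        i += 1
    return code


example = [
    ("mov", "$5", "%immreg1"),
    ("mov", "%immreg1", "%immreg2"),
    ("add", "$1", "%immreg2"),
    ("mov", "%immreg2", "%eax"),
]
print(coalesce_moves(example))
```

In this sketch the second mov disappears and the later instructions use
%immreg1 directly, so the routine needs one imaginary register fewer.
If %immreg1 were still read after the mov, the list would be returned
unchanged.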
Something new to research!
Gareth aka. Kit