[fpc-devel] Optimization theory
Florian Klämpfl
florian at freepascal.org
Sun Jun 17 10:56:23 CEST 2018
On 16.06.2018 at 23:21, J. Gareth Moreton wrote:
> Note that I speak mostly from an x86_64 perspective, since this is where I have almost universal exposure.
>
> So I've been pondering a few things after researching Florian's prototype patch for optimisations done prior to register
> allocation, when the pre-compiled assembly language utilises imaginary (virtual) registers pretty much everywhere other
> than where distinct registers are required (e.g. function parameters). My question is... how much can be moved to the
> pre-allocation stage?
A lot, basically everything that reduces register pressure. The only problem is that, at this stage, the code contains a
lot of moves (compile with -sr to see what it looks like), so the optimizer must be able to handle them. It might even be
possible to build a generic optimizer pass at this stage. Example:
A typical sequence FPC often generates is:
mov %src1,%dest1
add %dest1,%src2,%dest2
If src1 is not released after the mov but dest1 is released, src1 and dest1 still cannot be coalesced because they
interfere, so an extra register is allocated. The move will be removed by the peephole optimizer, but by then the register
has already been allocated and has increased register pressure. Such optimizations could be done generically (for all
CPUs): if the destination of a mov is only read afterwards (this information is already generically available), the mov
can be removed, and in this case dest1 can be replaced by src1.
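
The rule above can be sketched as a small pass over virtual-register code. This is only an illustration in Python, not
FPC code: the Instr type and the remove_redundant_moves function are hypothetical names, and the sketch assumes a single
straight-line block whose virtual registers are not used after the block. It also adds one conservative check that is an
assumption of this sketch rather than something stated above: src must not be overwritten later either, otherwise
replacing dest by src would read the wrong value.

```python
from typing import NamedTuple, Tuple, List


class Instr(NamedTuple):
    """Hypothetical three-address instruction on virtual registers."""
    op: str
    srcs: Tuple[str, ...]
    dst: str


def remove_redundant_moves(code: List[Instr]) -> List[Instr]:
    """Drop 'mov src,dst' when dst is only read afterwards.

    Assumes straight-line code and that neither src nor dst is
    written by any later instruction in the block (conservative).
    """
    code = list(code)
    i = 0
    while i < len(code):
        ins = code[i]
        if ins.op == "mov":
            src, dst = ins.srcs[0], ins.dst
            rest = code[i + 1:]
            # dst is only read afterwards, and src keeps its value
            only_read = all(later.dst not in (src, dst) for later in rest)
            if src != dst and only_read:
                # Replace every later read of dst by src
                code[i + 1:] = [
                    later._replace(
                        srcs=tuple(src if s == dst else s for s in later.srcs)
                    )
                    for later in rest
                ]
                del code[i]  # the mov itself is no longer needed
                continue     # re-examine the instruction now at index i
        i += 1
    return code


# The sequence from the example: mov %src1,%dest1 ; add %dest1,%src2,%dest2
example = [
    Instr("mov", ("src1",), "dest1"),
    Instr("add", ("dest1", "src2"), "dest2"),
]
optimized = remove_redundant_moves(example)
```

Because the mov disappears before register allocation, dest1 never needs a register of its own, which is the point of
doing this early instead of leaving it to the peephole optimizer.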