[fpc-devel] AArch64 Register efficiency
J. Gareth Moreton
gareth at moreton-family.com
Thu Aug 20 14:18:44 CEST 2020
While evaluating the assembly language produced by the AArch64 implementation of the Free Pascal Compiler, I've noticed that it uses the stack an awful lot and, generally, makes use of few of the 28 or so
general-purpose registers available to it.
The main problem is that even though x9 to x15 are designed to be used as temporaries for local values (caller-saved), the compiler doesn't make use of them because it has to write them to the stack anyway whenever
it calls another subroutine. Likewise, x0-x7 are not used because they're volatile and may not retain their values after a subroutine call, while x19-x28 are callee-saved rather than volatile under the standard
AArch64 calling convention, so a routine that uses them has to save and restore them on the stack itself. Nevertheless, leaf functions, which make no calls, tend to perform a little better when it comes to register use.
It's got me thinking... for leaf functions (maybe even all functions if research shows it's plausible), once a function has been assigned registers (and maybe gone through peephole optimisation), would it be feasible
to store a list of the registers it uses with the object that the compiler holds for that function? That way, when the function is called by another routine, the peephole optimizer and the register allocator can see
which registers it leaves untouched, so the caller is more likely to keep temporary values in a volatile register rather than on the stack. The approach is also potentially cross-platform.
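
To make the idea concrete, here is a rough sketch, written in Python purely for brevity rather than in the compiler's own Pascal. Everything in it is hypothetical: ProcInfo, clobbered_by and pick_temp are illustrative names, not existing compiler structures, and the register table only covers the caller-saved registers mentioned above. A callee with no recorded summary is treated as clobbering every volatile register, which is the behaviour the compiler has to assume today.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Caller-saved registers named in this post (x0-x7 and x9-x15).
# Assumption: a real implementation would take this from the target's ABI tables.
TEMP_ORDER = [f"x{n}" for n in range(9, 16)] + [f"x{n}" for n in range(0, 8)]
VOLATILE = frozenset(TEMP_ORDER)


@dataclass
class ProcInfo:
    """Hypothetical per-routine record kept by the compiler."""
    name: str
    is_leaf: bool = False
    used_regs: Optional[FrozenSet[str]] = None  # None means "not known"


def clobbered_by(callee: ProcInfo) -> FrozenSet[str]:
    """Volatile registers a call to `callee` may overwrite."""
    if callee.is_leaf and callee.used_regs is not None:
        return callee.used_regs & VOLATILE
    # No summary, or the callee makes calls of its own: assume the worst.
    return VOLATILE


def pick_temp(calls: List[ProcInfo]) -> Optional[str]:
    """Pick a volatile register that survives every call in `calls`.

    Returns None when no such register exists, in which case the
    value has to be spilled to the stack as before.
    """
    clobbered = set()
    for callee in calls:
        clobbered |= clobbered_by(callee)
    for reg in TEMP_ORDER:
        if reg not in clobbered:
            return reg
    return None
```

With a leaf routine recorded as using only x0, x1 and x9, pick_temp would hand back x10 for a value that is live across the call; with an external routine that has no summary, it returns None and the value goes to the stack.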
The only sticking points I can currently see for this system are routines that contain blocks of inline assembly language, which would force a line-by-line evaluation of such blocks to see what's in use, and some
additional overhead in the register allocator. That overhead might necessitate only enabling the feature under -O3 or -O4 (I'm tempted to say -O4 because of the potential risk of side-effects, even though any such
effects are either bugs or due to the programmer going out of their way to trick the compiler with some dangerous assembly language, e.g. literal bytes that represent direct machine code). Thoughts?
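
For the inline assembly case, a line-by-line scan could look something like the sketch below (again Python, again hypothetical: asm_used_regs is an illustrative name, and GNU-assembler-style syntax with // comments and directives such as .byte or .inst is assumed). It over-approximates by recording every x or w register mentioned, whether read or written, and gives up entirely when it meets raw data, since literal bytes could encode any instruction.

```python
import re
from typing import FrozenSet, Iterable, Optional

# Matches x0..x30 / w0..w30 as whole words; "0x10" does not match because
# there is no word boundary between the "0" and the "x".
REG = re.compile(r"\b[xw](\d{1,2})\b")

# Directives that emit raw data (assumed GNU assembler spellings).
RAW_DATA = re.compile(r"^\s*\.(byte|hword|word|long|inst)\b")


def asm_used_regs(lines: Iterable[str]) -> Optional[FrozenSet[str]]:
    """Return the registers mentioned in an inline assembly block.

    Returns None when the block cannot be analysed, in which case the
    caller should treat every volatile register as clobbered.
    """
    used = set()
    for line in lines:
        code = line.split("//", 1)[0].lower()  # drop trailing comment
        if RAW_DATA.match(code):
            return None  # literal machine code: anything could be touched
        for num in REG.findall(code):
            if int(num) <= 30:
                used.add(f"x{int(num)}")  # w registers alias the x registers
    return frozenset(used)
```

A None result maps directly onto the "no summary" case: the routine is simply treated as it is today, so the worst outcome of unanalysable assembly is a missed optimisation rather than a miscompile.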
Gareth aka. Kit