<HTML>
<div><style> BODY { font-family:Arial, Helvetica, sans-serif;font-size:12px; }</style>Note that I speak mostly from an x86_64 perspective, since that is where almost all of my experience lies.<br>
</div><div><br>
</div><div>So I've been pondering a few things after researching Florian's prototype patch for optimisations done prior to register allocation, at which stage the assembly language uses imaginary (virtual) registers pretty much everywhere except where specific registers are required (e.g. function parameters). My question is: how much can be moved to the pre-allocation stage? I believe it could make the output far more efficient, especially combined with my initial work on the deep optimizer, as a few people have stated already. Unless I'm mistaken, from what I've observed, a lot of the peephole optimizations don't actually care which registers they're working with.<br>
</div><div><br>
</div><div>There are also situations in optimization where one might detect a repeated calculation that the programmer cannot eliminate themselves, and where the best optimization is to store the result and re-use it later. The most obvious case is div and mod with the same numerator and denominator. Currently, the compiler doesn't know any better and has to calculate the division twice, a relatively expensive operation, even though DIV returns both the quotient and the remainder, in RAX and RDX respectively. I believe storing the mod result in a virtual register for later use will be far easier to manage when the registers have not yet been allocated. That is especially true if it's determined that a new register has to be preserved in the function prologue, or if there are no free registers at all and the value has to be put on the stack. It also matters when it's possible to use RDX itself as the temporary storage (the ideal outcome in both speed and size), something that would be near impossible if RDX has already been allocated for something in between.<br>
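<br>
To make the div/mod idea concrete, here is a minimal sketch in Python (not compiler code) of fusing a div and a mod on the same virtual registers into one operation. The <code>Instr</code> record, the <code>divmod</code> pseudo-op and the <code>fuse_div_mod</code> pass are all hypothetical names made up for illustration, and the sketch assumes a single basic block with no side effects to worry about.<br>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical three-address instruction on virtual registers."""
    op: str      # e.g. "div", "mod", "divmod", "add"
    dst: tuple   # virtual registers written
    src: tuple   # virtual registers read


def _touched(instrs, reg):
    """True if any instruction in instrs reads or writes reg."""
    return any(reg in i.src or reg in i.dst for i in instrs)


def fuse_div_mod(block):
    """Fuse a div and a mod on the same operands into one divmod.

    Assumption: block is a single basic block (straight-line code).
    """
    out = []
    pending = {}  # maps (numerator, denominator) to the index in out of an unfused div or mod
    for ins in block:
        idx = pending.get(ins.src) if ins.op in ("div", "mod") else None
        if (idx is not None
                and out[idx].op != ins.op
                and out[idx].dst[0] != ins.dst[0]
                and not _touched(out[idx + 1:], ins.dst[0])):
            first = out[idx]
            quot = first.dst[0] if first.op == "div" else ins.dst[0]
            rem = ins.dst[0] if first.op == "div" else first.dst[0]
            # One division now produces both results at the earlier position.
            out[idx] = Instr("divmod", (quot, rem), first.src)
            del pending[ins.src]
            continue
        out.append(ins)
        # A write to a numerator or denominator ends any pending reuse.
        for key in [k for k in pending if any(r in k for r in ins.dst)]:
            del pending[key]
        if ins.op in ("div", "mod") and not any(r in ins.src for r in ins.dst):
            pending[ins.src] = len(out) - 1
    return out


block = [
    Instr("div", ("v3",), ("v1", "v2")),
    Instr("add", ("v5",), ("v3", "v4")),
    Instr("mod", ("v6",), ("v1", "v2")),
]
fused = fuse_div_mod(block)
```
</pre>
With virtual registers, the remainder simply gets its own name (<code>v6</code> here), and it is left to the register allocator to decide later whether that ends up being RDX, another register or a stack slot.<br>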
</div><div><br>
</div><div>Speaking of the deep optimizer, I do have a patch ready to submit once the backlog of other patches has been analysed (it relies on a few of them), and I'll be working on porting elements of it to the pre-allocation stage. Some bits are a little difficult because of how imaginary registers are tracked: if you delete an instruction, you have to be careful not to leave a dangling pointer in the "live range".<br>
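<br>
As a rough illustration of the bookkeeping problem, here is a small Python sketch in which each live range keeps references to the instructions that touch its register, and deletion unlinks the instruction from every live range before removing it. The <code>LiveRange</code> and <code>Block</code> classes are hypothetical and are not how the compiler actually stores this information; Python has no dangling pointers as such, so a stale reference left in <code>uses</code> stands in for one.<br>
<pre>
```python
class LiveRange:
    """Hypothetical live range: the instructions that touch one virtual register."""

    def __init__(self, vreg):
        self.vreg = vreg
        self.uses = []  # instructions reading or writing vreg, in program order


class Block:
    """Hypothetical basic block that keeps live ranges in sync with its code."""

    def __init__(self):
        self.instrs = []
        self.ranges = {}  # maps a virtual register name to its LiveRange

    def emit(self, op, dst, src):
        ins = {"op": op, "dst": dst, "src": src}
        self.instrs.append(ins)
        for reg in dst + src:
            self.ranges.setdefault(reg, LiveRange(reg)).uses.append(ins)
        return ins

    def delete(self, ins):
        # Unlink from every live range first, so none keeps a stale reference.
        for reg in set(ins["dst"] + ins["src"]):
            rng = self.ranges[reg]
            rng.uses = [u for u in rng.uses if u is not ins]
            if not rng.uses:
                del self.ranges[reg]  # the register is no longer live anywhere
        self.instrs = [i for i in self.instrs if i is not ins]


b = Block()
keep = b.emit("add", ("v2",), ("v0", "v1"))
dead = b.emit("mov", ("v3",), ("v0",))  # result never used
b.delete(dead)
```
</pre>
The point is only that deletion has to go through one routine that knows about the live ranges; removing the instruction from the list directly would leave <code>v0</code>'s range pointing at an instruction that no longer exists.<br>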
</div><div><br>
</div><div>This is just thinking out loud and pondering.</div><div><br>
</div><div>Gareth aka. Kit<br>
</div> </HTML>