[fpc-devel] Kit's ambitions!

David Pethes public at satd.sk
Mon Jun 11 21:27:16 CEST 2018


Hi,
nice work.

On 8. 6. 2018 0:46, J. Gareth Moreton wrote:

> The deep optimiser changes this to:
> 
> movq %rcx,%rax
> movq %rdx,%rsi
> movq %rcx,%rbx
> 
> It determines, for the third MOV, it can 
> change %rax for %rcx to minimise a 
> pipeline stall, and then knows that %rbx 
> and %rcx contain the same value, so can 
> remove the 4th MOV completely. Given that 
> modern processors usually have at least 3 
> ALUs and the interdependencies have been 
> removed, this will likely give a speed 
> increase of one cycle over these few 
> commands.

Note that modern CPUs can apply move elimination to register-to-register
moves, so such a move doesn't cost any execution resources (it's "free").
Removing it is still a win, though, because it saves both bytes in the
I-cache and decoder bandwidth (which can indirectly save a cycle or more
elsewhere).
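
For reference, the rewrite described in the quoted text can be sketched as
a small copy-propagation pass. This is only an illustration in Python, not
the actual FPC optimiser code: the function name propagate_copies is made
up, and the four-instruction input is an assumption chosen so that the
result matches the three MOVs quoted above.

```python
def propagate_copies(movs):
    """Copy propagation over (src, dst) register-to-register moves.

    Hypothetical helper for illustration only; operands are in AT&T
    order (source first, destination second).
    """
    copy_of = {}  # register -> older register currently holding the same value
    out = []
    for src, dst in movs:
        # Read from the oldest register that holds the value, which
        # shortens the dependency chain between consecutive moves.
        root = copy_of.get(src, src)
        if root == copy_of.get(dst, dst):
            # dst already holds this value, so the move is redundant.
            continue
        out.append((root, dst))
        # dst is overwritten: forget registers recorded as copies of it.
        for reg in [r for r, base in copy_of.items() if base == dst]:
            del copy_of[reg]
        copy_of[dst] = root
    return out


# Assumed input sequence (not taken from the original mail).
assumed_input = [
    ("%rcx", "%rax"),
    ("%rdx", "%rsi"),
    ("%rax", "%rbx"),
    ("%rbx", "%rcx"),
]

for src, dst in propagate_copies(assumed_input):
    print(f"movq {src},{dst}")
```

With that assumed input, the third move is rewritten to read %rcx instead
of %rax and the fourth is dropped, leaving the three MOVs shown in the
quote.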

David


