[fpc-devel] Kit's ambitions!
David Pethes
public at satd.sk
Mon Jun 11 21:27:16 CEST 2018
Hi,
nice work.
On 8. 6. 2018 0:46, J. Gareth Moreton wrote:
> The deep optimiser changes this to:
>
> movq %rcx,%rax
> movq %rdx,%rsi
> movq %rcx,%rbx
>
> It determines, for the third MOV, it can
> change %rax for %rcx to minimise a
> pipeline stall, and then knows that %rbx
> and %rcx contain the same value, so can
> remove the 4th MOV completely. Given that
> modern processors usually have at least 3
> ALUs and the interdependencies have been
> removed, this will likely give a speed
> increase of one cycle over these few
> commands.
Note that modern CPUs can apply move elimination to register-to-register
moves: such a move is resolved at the rename stage and occupies no
execution unit (it's "free"). Removing the MOV is still a win, though,
because it saves both bytes in the I-cache and decoder bandwidth, which
can indirectly save a cycle or more elsewhere.
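
To make the quoted rewrite concrete, here is a minimal sketch in Python of copy propagation over a straight-line run of 64-bit register moves. It is not the FPC optimiser's code. The function name `optimise_moves` and the four-instruction input are assumptions made for illustration (the original "before" sequence is not shown in the quote); the input is chosen only so that the result matches the three MOVs quoted above.

```python
# Hypothetical sketch, not FPC's implementation.
# Each instruction is (src, dst), mirroring AT&T order "movq src,dst".

def optimise_moves(moves):
    """Forward the oldest known copy of each value and drop moves
    whose destination already holds that value."""
    alias = {}  # register -> oldest register known to hold the same value
    out = []
    for src, dst in moves:
        root = alias.get(src, src)
        if alias.get(dst, dst) == root:
            # dst already contains this value: the move is redundant.
            continue
        # dst is about to be overwritten, so registers recorded as
        # copies of dst no longer have a known source (conservative).
        for reg in [r for r, a in alias.items() if a == dst]:
            del alias[reg]
        alias[dst] = root
        # Read from the root to shorten the dependency chain.
        out.append((root, dst))
    return out


# Assumed input sequence (hypothetical, see the note above).
before = [
    ("%rcx", "%rax"),
    ("%rdx", "%rsi"),
    ("%rax", "%rbx"),
    ("%rbx", "%rcx"),
]

for src, dst in optimise_moves(before):
    print(f"movq {src},{dst}")
```

Each `movq` between two 64-bit registers encodes in 3 bytes (REX.W prefix, opcode, ModRM), so dropping one from a sequence like this saves 3 bytes of code, which is the I-cache and decoder saving mentioned above.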
David