[fpc-devel] Optimisation and memory alignment question
J. Gareth Moreton
gareth at moreton-family.com
Sun Feb 28 11:11:09 CET 2021
Hi everyone,
So to get to the point, I've spotted another potential peephole
optimisation specifically on x86_64:
movq (%rdx),%rax
shrq $32,%rax
Is it acceptable to change this to the following?
movl 4(%rdx),%eax
Logically it's equivalent thanks to the guarantee that a 32-bit write
zeroes the upper 32 bits of the destination register, but I know there
is sometimes a penalty for reading from memory that isn't aligned to,
say, a 16-byte boundary.
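
To make the equivalence claim concrete, here is a minimal sketch in Python that models memory as a little-endian byte string. The function names and the test buffer are made up for illustration and have nothing to do with the compiler's code. It only compares the value left in the register; it does not model the flags, which shrq writes and movl leaves untouched, and it says nothing about timing. The narrow load reads only bytes 4 to 7 of the original 8-byte access, so it never touches memory the original did not.

```python
import struct


def wide_load_then_shift(mem: bytes, addr: int) -> int:
    """Model of: movq (%rdx),%rax ; shrq $32,%rax"""
    (value,) = struct.unpack_from("<Q", mem, addr)  # 8-byte little-endian load
    return value >> 32  # logical shift; the value is unsigned


def narrow_load(mem: bytes, addr: int) -> int:
    """Model of: movl 4(%rdx),%eax (upper half of %rax is zeroed)"""
    (value,) = struct.unpack_from("<I", mem, addr + 4)  # 4-byte load at offset 4
    return value


# Hypothetical test buffer; every start address is tried, aligned or not.
mem = bytes(range(0x10, 0x30))
for addr in range(len(mem) - 7):
    assert wide_load_then_shift(mem, addr) == narrow_load(mem, addr)
```
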
A "movl; shrl $16" version may be possible with movzx, but I'm not
certain whether that would be even less efficient, since the offset
would then be 2 rather than 4.
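
The 16-bit variant can be sketched the same way, assuming the replacement would be a zero-extending 16-bit load (movzwl in AT&T syntax) at offset 2. As before, the names and buffer are hypothetical, memory is modelled as a little-endian byte string, and only the resulting register value is compared, not flags or performance.

```python
import struct


def load32_then_shift16(mem: bytes, addr: int) -> int:
    """Model of: movl (%rdx),%eax ; shrl $16,%eax"""
    (value,) = struct.unpack_from("<I", mem, addr)  # 4-byte little-endian load
    return value >> 16


def zero_extended_load16(mem: bytes, addr: int) -> int:
    """Model of (assumed replacement): movzwl 2(%rdx),%eax"""
    (value,) = struct.unpack_from("<H", mem, addr + 2)  # 2-byte load at offset 2
    return value


# Hypothetical test buffer; every start address is tried, aligned or not.
mem = bytes(range(0xA0, 0xC0))
for addr in range(len(mem) - 3):
    assert load32_then_shift16(mem, addr) == zero_extended_load16(mem, addr)
```

If the original 4-byte access is 4-byte aligned, the 2-byte access at offset 2 is still 2-byte aligned, so it stays naturally aligned for its size.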
Gareth aka. Kit