[fpc-devel] Optimization of redundant mov's

Martok listbox at martoks-place.de
Sat Mar 18 23:32:55 CET 2017


Hi all,

There has been some discussion about FPC's optimizer in #31444, prompting me to
investigate some of my own code. Generally speaking, the generated assembler is
not all that bad (I like how it uses LEA for almost all integer arithmetic),
but I keep seeing sections with lots of redundant MOVs.

Here is an example from a SHA512 implementation. CurrentHash is a field of the
current class, and the code was compiled with an optimization level above -O2,
-CpCOREAVX2 and -Px86_64.

 a:= CurrentHash[0]; b:= CurrentHash[1]; c:= CurrentHash[2]; d:= CurrentHash[3];
0000000100074943 488b8424a0020000         mov    0x2a0(%rsp),%rax
000000010007494B 4c8b5038                 mov    0x38(%rax),%r10
000000010007494F 488b8424a0020000         mov    0x2a0(%rsp),%rax
0000000100074957 4c8b5840                 mov    0x40(%rax),%r11
000000010007495B 488b9424a0020000         mov    0x2a0(%rsp),%rdx
0000000100074963 488b4248                 mov    0x48(%rdx),%rax
0000000100074967 488b9424a0020000         mov    0x2a0(%rsp),%rdx
000000010007496F 488b6a50                 mov    0x50(%rdx),%rbp

Every one of the "mov 0x2a0(%rsp),%rxx" instructions except the first is
redundant: each reloads the same self-pointer from the stack and costs another
memory access. The reloads also tie up more registers (%rdx in addition to
%rax), which probably makes other optimizations more difficult, especially when
something similar happens on i386, where far fewer registers are available.

Now, the fun part: I haven't been able to build a simple test that causes the
same issue (there, the self-pointer is already in %rcx and is not fetched from
the stack each time), so I have a feeling this may be a side effect of some
other part of the code.
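
To illustrate the access pattern, here is a minimal sketch of the source shape.
Only CurrentHash and the a..d assignment line come from the real code; the
class name THashSketch, the method Mix and the initial values are hypothetical
stand-ins. In line with the observation above, a reduced program like this is
not expected to reproduce the redundant loads.

```pascal
program MovSketch;
{$mode objfpc}

type
  // Hypothetical stand-in for the real SHA512 class.
  // Only the field name CurrentHash is taken from the original code.
  THashSketch = class
  private
    CurrentHash: array[0..7] of QWord;
  public
    constructor Create;
    function Mix: QWord;
  end;

constructor THashSketch.Create;
var
  i: Integer;
begin
  inherited Create;
  // Arbitrary values, just so the reads below have something to load.
  for i := 0 to High(CurrentHash) do
    CurrentHash[i] := QWord(i + 1);
end;

function THashSketch.Mix: QWord;
var
  a, b, c, d: QWord;
begin
  // Four consecutive field reads through Self, as in the excerpt above.
  a := CurrentHash[0]; b := CurrentHash[1]; c := CurrentHash[2]; d := CurrentHash[3];
  // Use all four values so that none of the loads is dead code.
  Result := a + b + c + d;
end;

var
  h: THashSketch;
begin
  h := THashSketch.Create;
  try
    WriteLn(h.Mix);
  finally
    h.Free;
  end;
end.
```

Compiling this with the options mentioned above plus -al should keep the
generated assembler file, so the loads in Mix can be compared with the excerpt.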

Does this sound familiar to anyone? If so, what could I do about it?


Regards,

Martok



