[fpc-devel] vmul commutative optimization?
J. Gareth Moreton
gareth at moreton-family.com
Tue Nov 12 16:08:47 CET 2019
It's true. With VMULSS, only the first operand in AT&T syntax (the third
operand in Intel syntax) can be a memory address (source: Intel(R) 64 and
IA-32 Architectures Software Developer's Manual, Volume 2B, Page 4-154).
I'll see if I can work in that optimisation for the commutative
operations (+ and *) at some point from the node side.
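
As a rough illustration of the idea (a toy model in Python, not FPC's
actual node tree or code generator), the sketch below swaps the operands
of a commutative operation when the left one lives in memory and the right
one is in a register. It assumes the code generator puts the right-hand
operand in the slot that may be a memory reference. The names Operand,
BinOp, prefer_memory_on_right and emit_mul, and the label C31(%rip), are
made up for this example.

```python
from dataclasses import dataclass

# Operators whose operands may be exchanged without changing the result.
COMMUTATIVE = {"+", "*"}


@dataclass(frozen=True)
class Operand:
    name: str        # register name or memory reference, AT&T style
    in_memory: bool  # True if the value lives in memory (e.g. a typed constant)


@dataclass(frozen=True)
class BinOp:
    op: str
    left: Operand
    right: Operand


def prefer_memory_on_right(node):
    """Swap operands of a commutative op so a memory operand ends up on the right.

    Assumption: the right operand maps to the instruction slot that
    accepts a memory reference, so no separate load is needed for it.
    """
    if node.op in COMMUTATIVE and node.left.in_memory and not node.right.in_memory:
        return BinOp(node.op, node.right, node.left)
    return node


def emit_mul(node, dest, scratch):
    """Emit AT&T-syntax text for a scalar single-precision multiply.

    The left operand must be in a register, so a memory operand on the
    left is first loaded into the scratch register.
    """
    code = []
    left = node.left.name
    if node.left.in_memory:
        code.append(f"vmovss {left},{scratch}")
        left = scratch
    code.append(f"vmulss {node.right.name},{left},{dest}")
    return code


if __name__ == "__main__":
    const = Operand("C31(%rip)", in_memory=True)   # hypothetical label
    reg = Operand("%xmm0", in_memory=False)
    node = BinOp("*", const, reg)
    print(emit_mul(node, "%xmm0", "%xmm2"))                          # load + multiply
    print(emit_mul(prefer_memory_on_right(node), "%xmm0", "%xmm2"))  # multiply only
```

In this model the unswapped node needs a vmovss followed by a vmulss,
while the swapped node needs a single vmulss with a memory operand.
Non-commutative operators such as - are left alone.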
Gareth aka. Kit
On 12/11/2019 12:22, Marco van de Voort wrote:
> I compiled some bits with avx, and noticed that when you do
> then that generates something like
> vmovss TC_$FFTS_$$_C31(%rip),%xmm2
> vmulss %xmm0,%xmm2,%xmm0
> while if you do
> it generates
> vmulss TC_$FFTS_$$_C32(%rip),%xmm2,%xmm2
> I assume the reason is that only the first param can be an address,
> and the second a register. But the compiler isn't smart enough to
> exchange them.