[fpc-devel] vmul commutative optimization?
J. Gareth Moreton
gareth at moreton-family.com
Tue Nov 12 20:46:41 CET 2019
The Microsoft ABI is a bit restrictive when it comes to record types; to
quote: "Structs and unions of size 8, 16, 32, or 64 bits, and __m64 types,
are passed as if they were integers of the same size." So unfortunately, a
single-precision complex number is treated as a 64-bit structure and
passed as an integer. The System V ABI, on the other hand, would pass
the two entries through the lower 64 bits of XMM0. Vectorcall,
theoretically, should put the two components into XMM0 and XMM1, because
the complex type would be considered a "homogeneous vector aggregate"
(with floats as 1-dimensional vectors).
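To make the size rule concrete, here is a minimal sketch in C. The type and function names (complex_single, complex_double, passed_as_integer_ms_x64, as_integer_bits) are made up for illustration and are not taken from FPC or the complex patch. It only models the quoted sentence of the Microsoft x64 convention; it does not ask any compiler what it actually does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the complex records discussed in this thread. */
typedef struct { float  re, im; } complex_single;  /* 2 x 4 bytes = 8 bytes  */
typedef struct { double re, im; } complex_double;  /* 2 x 8 bytes = 16 bytes */

/* The memcpy below assumes the single-precision record is exactly 64 bits. */
_Static_assert(sizeof(complex_single) == sizeof(uint64_t),
               "complex_single is expected to be 8 bytes");

/* Models only the quoted rule: records of 8, 16, 32 or 64 bits
 * (1, 2, 4 or 8 bytes) are passed as integers of the same size. */
int passed_as_integer_ms_x64(size_t size_in_bytes)
{
    return size_in_bytes == 1 || size_in_bytes == 2 ||
           size_in_bytes == 4 || size_in_bytes == 8;
}

/* What the callee effectively receives in an integer register: the raw
 * 64 bits of the record, which must be moved to an XMM register (often
 * via the stack) before any floating-point arithmetic can be done. */
uint64_t as_integer_bits(complex_single z)
{
    uint64_t bits;
    memcpy(&bits, &z, sizeof bits);
    return bits;
}
```

With these definitions, passed_as_integer_ms_x64(sizeof(complex_single)) is true while the same check on complex_double is false, which matches the difference between the single and double variants described in the quoted message below.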
I think the overhead that comes with issues such as this is the reason
why vectorcall was developed in the first place.
Gareth aka. Kit
On 12/11/2019 16:05, Marco van de Voort wrote:
> Op 12/11/2019 om 16:08 schreef J. Gareth Moreton:
>> It's true. With VMULSS, only the first parameter (third parameter
>> under Intel notation) can be an address (source: Intel(R) 64 and
>> IA-32 Architectures Software Developer's Manual, Volume 2B, Page 4-154).
>> I'll see if I can work in that optimisation for the commutative
>> operations (+ and *) at some point from the node side.
> Another tidbit I noticed while playing with (elements of) the complex
> patch is that if I set the element size to double (re:double;im:double),
> the compiler loads all data into registers with vectorcall.
> However, if I make it single (in other words, the tcomplex is 8 bytes),
> the records are loaded into integer registers, and the compiler stores
> them to the stack and then reloads them.
> This matters less for me since it won't vectorize anyway (see the inline
> and philosophy thread). I think I'll change this routine to assembler,
> accepting a pointer and load and store from that thread.