[fpc-devel] x86_64 question
Nikolay Nikolov
nickysn at gmail.com
Fri Oct 2 15:13:42 CEST 2020
On 10/2/20 2:13 PM, J. Gareth Moreton via fpc-devel wrote:
> Confirmed my suspicions. If I zero the upper bits of the register (I
> used something akin to "AND RCX, $F"), there is no speed loss.
>
> Therefore, I can make the hypothesis, on my Intel(R) Core(TM)
> i7-10750H, that using TEST on a sub-register causes a false dependency
> if the bits outside of the subset are not zero, even though the
> register isn't being modified.
If you send me a test program, I can run it on my Ryzen 5 2500U to see
how AMD behaves. We don't specifically optimize for AMD (yet), but it's
interesting to know.
Nikolay
>
> Gareth aka. Kit
>
> On 02/10/2020 11:57, J. Gareth Moreton via fpc-devel wrote:
>> So... I've done some tests, replacing TEST RCX, $4 with TEST CL, $4
>> and the like in a number-crunching function, and it seems to cause a
>> notable penalty, even though none of the instructions are in my
>> critical loop. So I think it's something that needs to be avoided in
>> most cases. I think the reason why it worked in my Int and Frac
>> functions is because the processor knows the upper 48 bits of the
>> register are zero.
>>
>> Long story short... best not to do it unless you have some additional
>> insight into what the registers contain.
>>
>> Gareth aka. Kit
>>
>>
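
For readers following the thread, here is a minimal sketch (not from the
original posts) of why the two encodings are interchangeable as far as the
result goes: the mask $4 fits in the low byte, so testing CL yields the same
zero/non-zero answer as testing all of RCX. The helper names are hypothetical
and chosen for illustration. The code only models the logical outcome; it
says nothing about the timing difference reported above, which has to be
measured on real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models TEST RCX, $4: checks bit 2 using all 64 bits of the value. */
static int test_full(uint64_t rcx) { return (rcx & UINT64_C(4)) != 0; }

/* Models TEST CL, $4: checks bit 2 using only the low byte. */
static int test_low_byte(uint64_t rcx) { return ((uint8_t)rcx & 4u) != 0; }

/* Models AND RCX, $F: clears every bit above bit 3. */
static uint64_t clear_upper(uint64_t rcx) { return rcx & UINT64_C(0xF); }

/* Checks that both forms agree, with and without the upper bits cleared. */
static void check_equivalence(void) {
    const uint64_t samples[] = {
        UINT64_C(0x0000000000000000),
        UINT64_C(0x0000000000000004),
        UINT64_C(0xFFFFFFFFFFFFFF04), /* bit 2 set, upper bits dirty */
        UINT64_C(0xDEADBEEF00000003), /* bit 2 clear, upper bits dirty */
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        uint64_t v = samples[i];
        assert(test_full(v) == test_low_byte(v));
        /* Clearing the upper bits keeps bit 2, so the answer is unchanged. */
        assert(test_full(clear_upper(v)) == test_full(v));
    }
}
```

Calling check_equivalence() from a small main() should pass silently, which
is consistent with the substitution being safe semantically and the concern
in this thread being purely about performance.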