[fpc-devel] x86_64 question

Nikolay Nikolov nickysn at gmail.com
Fri Oct 2 15:13:42 CEST 2020


On 10/2/20 2:13 PM, J. Gareth Moreton via fpc-devel wrote:
> Confirmed my suspicions.  If I zero the upper bits of the register (I 
> used something akin to "AND RCX, $F"), there is no speed loss.
>
> Therefore, I can make the hypothesis, on my Intel(R) Core(TM) 
> i7-10750H, that using TEST on a sub-register causes a false dependency 
> if the bits outside of the subset are not zero, even though the 
> register isn't being modified.

If you send me a test program, I can run it on my Ryzen 5 2500U to see 
how AMD behaves. We don't specifically optimize for AMD (yet), but it's 
interesting to know.
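
For anyone following along, the substitution being discussed relies on 
TEST RCX, $4 and TEST CL, $4 producing the same zero flag, since bit 2 
lies inside the low byte. Below is a minimal C sketch of that equivalence, 
and of the fact that masking with $F first does not change the outcome. 
The function names are hypothetical and this is not FPC code. It only 
models the logical result, so it says nothing about the timing or 
dependency behaviour described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models TEST RCX, $4: 1 if bit 2 of the full 64-bit value is set. */
int test_bit2_full(uint64_t rcx) { return (rcx & 4u) != 0; }

/* Models TEST CL, $4: only the low byte of the register takes part. */
int test_bit2_low_byte(uint64_t rcx) { return ((uint8_t)rcx & 4u) != 0; }

/* Models AND RCX, $F: clears everything above the low four bits. */
uint64_t mask_low_nibble(uint64_t rcx) { return rcx & 0xFu; }

/* Checks a few sample values (chosen arbitrarily for illustration),
   including ones with non-zero upper bits. Returns how many were checked. */
size_t check_equivalence(void) {
    static const uint64_t samples[] = {
        0x0ULL, 0x4ULL, 0xBULL, 0xFULL,
        0xFFFFFFFFFFFFFFFBULL, 0xFFFFFFFFFFFFFFFFULL,
        0x123456789ABCDE04ULL
    };
    const size_t n = sizeof samples / sizeof samples[0];
    for (size_t i = 0; i < n; i++) {
        uint64_t v = samples[i];
        /* Full-width and byte-width tests agree on bit 2. */
        assert(test_bit2_full(v) == test_bit2_low_byte(v));
        /* Zeroing the upper bits first does not change the result. */
        assert(test_bit2_low_byte(mask_low_nibble(v)) == test_bit2_full(v));
    }
    return n;
}
```

Calling check_equivalence() from a small main is enough to run it. Any 
speed difference would have to be measured separately on real hardware.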

Nikolay

>
> Gareth aka. Kit
>
> On 02/10/2020 11:57, J. Gareth Moreton via fpc-devel wrote:
>> So... I've done some tests, replacing TEST RCX, $4 with TEST CL, $4 
>> and the like in a number-crunching function, and it seems to cause a 
>> notable penalty, even though none of the instructions are in my 
>> critical loop.  So I think it's something that needs to be avoided in 
>> most cases.  I think the reason why it worked in my Int and Frac 
>> functions is because the processor knows the upper 48 bits of the 
>> register are zero.
>>
>> Long story short... best not to do it unless you have some additional 
>> insight into what the registers contain.
>>
>> Gareth aka. Kit
>>
>>

