[fpc-devel] ARM: AND/CMP -> TST optimisation produces incorrect results

J. Gareth Moreton gareth at moreton-family.com
Wed Feb 28 16:14:53 CET 2024


Hi Garry,

Hopefully I have fixed this issue now, which is also causing problems 
elsewhere.

https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/598 - just 
waiting on it to be verified, approved and merged.

Gareth aka. Kit

On 20/02/2024 06:32, J. Gareth Moreton via fpc-devel wrote:
>
> Thanks for the report and especially your investigative work. I'll 
> take a look to see what's going on.
>
> Gareth aka. Kit
>
> On 20/02/2024 01:30, Garry Wood via fpc-devel wrote:
>>
>> Hello,
>>
>> Commit 6b2e4fa4 (main) entitled “* arm: "OpCmp2OpS" moved to Pass 2 
>> so it doesn't conflict with AND; CMP -> TST optimisation” by Gareth 
>> from Feb 11 2024 produces incorrect assembler in certain cases.
>>
>> https://gitlab.com/freepascal.org/fpc/source/-/commit/6b2e4fa4133a496c1c3f89e3c71fffbdd7c192fb
>>
>> This piece of code:
>>
>> function CPUMaskCount(CPUMask:LongWord):LongWord;
>> var
>>  Count:LongWord;
>> begin
>>  {}
>>  Result:=0;
>>  for Count:=CPU_ID_0 to CPU_ID_MAX do
>>   begin
>>    if (CPUMask and (1 shl Count)) <> 0 then
>>     begin
>>      Inc(Result);
>>     end;
>>   end;
>> end;
>>
>> when compiled with FPC prior to commit 6b2e4fa4 produces the 
>> following working assembler:
>>
>> 00020528 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD>:
>>    20528: e1a01000    mov    r1, r0
>>    2052c: e3a00000    mov    r0, #0
>>    20530: e3a02000    mov    r2, #0
>>    20534: e3a03001    mov    r3, #1
>>    20538: e0113213    ands   r3, r1, r3, lsl r2
>>    2053c: 12800001    addne  r0, r0, #1
>>    20540: e2822001    add    r2, r2, #1
>>    20544: e352001f    cmp    r2, #31
>>    20548: 9afffff9    bls    20534 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD+0xc>
>>    2054c: e12fff1e    bx     lr
>>
>> But when compiled with FPC after commit 6b2e4fa4 it produces this 
>> assembler which doesn’t work:
>>
>> 00020528 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD>:
>>    20528: e1a01000    mov    r1, r0
>>    2052c: e3a00000    mov    r0, #0
>>    20530: e3a02000    mov    r2, #0
>>    20534: e3a03001    mov    r3, #1
>>    20538: e1110003    tst    r1, r3
>>    2053c: 12800001    addne  r0, r0, #1
>>    20540: e2822001    add    r2, r2, #1
>>    20544: e352001f    cmp    r2, #31
>>    20548: 9afffff9    bls    20534 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD+0xc>
>>    2054c: e12fff1e    bx     lr
>>
>> You can see that the difference is the missing lsl r2 operand on 
>> the TST instruction, which means that the shl in the original code 
>> is never performed and the test is therefore invalid.
>>
>> Similar code sequences in multiple other places produce the same 
>> result with the lsl suffix missing from the TST instruction.
>>
>> Please let me know if you need any further information.
>>
>> Garry Wood.
>>
>>
>> _______________________________________________
>> fpc-devel maillist  -fpc-devel at lists.freepascal.org
>> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel