[fpc-devel] ARM: AND/CMP -> TST optimisation produces incorrect results
Garry Wood
garry at softoz.com.au
Tue Feb 20 02:30:58 CET 2024
Hello,
Commit 6b2e4fa4 (main), titled "* arm: "OpCmp2OpS" moved to Pass 2 so it doesn't conflict with AND; CMP -> TST optimisation", committed by Gareth on Feb 11 2024, produces incorrect assembler in certain cases.
https://gitlab.com/freepascal.org/fpc/source/-/commit/6b2e4fa4133a496c1c3f89e3c71fffbdd7c192fb
This piece of code:
function CPUMaskCount(CPUMask:LongWord):LongWord;
var
  Count:LongWord;
begin
  {}
  Result:=0;
  for Count:=CPU_ID_0 to CPU_ID_MAX do
  begin
    if (CPUMask and (1 shl Count)) <> 0 then
    begin
      Inc(Result);
    end;
  end;
end;
when compiled with FPC prior to commit 6b2e4fa4 produces the following working assembler:
00020528 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD>:
   20528:  e1a01000  mov    r1, r0
   2052c:  e3a00000  mov    r0, #0
   20530:  e3a02000  mov    r2, #0
   20534:  e3a03001  mov    r3, #1
   20538:  e0113213  ands   r3, r1, r3, lsl r2
   2053c:  12800001  addne  r0, r0, #1
   20540:  e2822001  add    r2, r2, #1
   20544:  e352001f  cmp    r2, #31
   20548:  9afffff9  bls    20534 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD+0xc>
   2054c:  e12fff1e  bx     lr
But when compiled with FPC after commit 6b2e4fa4 it produces this assembler which doesn't work:
00020528 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD>:
   20528:  e1a01000  mov    r1, r0
   2052c:  e3a00000  mov    r0, #0
   20530:  e3a02000  mov    r2, #0
   20534:  e3a03001  mov    r3, #1
   20538:  e1110003  tst    r1, r3
   2053c:  12800001  addne  r0, r0, #1
   20540:  e2822001  add    r2, r2, #1
   20544:  e352001f  cmp    r2, #31
   20548:  9afffff9  bls    20534 <GLOBALCONFIG_$$_CPUMASKCOUNT$LONGWORD$$LONGWORD+0xc>
   2054c:  e12fff1e  bx     lr
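The only difference is the instruction at 20538: the original "ands r3, r1, r3, lsl r2" has become "tst r1, r3", so the "lsl r2" shifter operand is lost. The shl from the Pascal source is therefore never performed, and every iteration of the loop tests bit 0 of CPUMask instead of bit Count, which makes the test invalid.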
Similar code sequences in multiple other places produce the same result with the lsl suffix missing from the TST instruction.
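To make the effect of the missing shift concrete, here is a rough C model of the two loops. It is only a sketch: the function names count_with_shift, count_without_shift and check_examples are made up for illustration, and the loop bounds 0..31 are read off the "mov r2, #0" and "cmp r2, #31" instructions in the listings rather than from the definitions of CPU_ID_0 and CPU_ID_MAX.

```c
#include <assert.h>
#include <stdint.h>

/* Loop bounds as they appear in the listings (r2 runs from 0 while r2 <= 31).
   These are assumptions taken from the disassembly, not from the Pascal unit. */
#define FIRST_BIT 0u
#define LAST_BIT  31u

/* Models "ands r3, r1, r3, lsl r2": the constant 1 in r3 is shifted left by
   the loop counter before the AND, so bit `count` is tested. */
uint32_t count_with_shift(uint32_t cpu_mask)
{
    uint32_t result = 0;
    for (uint32_t count = FIRST_BIT; count <= LAST_BIT; count++) {
        uint32_t r3 = 1u;
        if ((cpu_mask & (r3 << count)) != 0)
            result++;
    }
    return result;
}

/* Models "tst r1, r3": the shift is gone, so r3 is always 1 and only
   bit 0 of the mask is tested on every iteration. */
uint32_t count_without_shift(uint32_t cpu_mask)
{
    uint32_t result = 0;
    for (uint32_t count = FIRST_BIT; count <= LAST_BIT; count++) {
        uint32_t r3 = 1u;
        if ((cpu_mask & r3) != 0)
            result++;
    }
    return result;
}

/* A few hand-checked cases comparing the two models. */
void check_examples(void)
{
    assert(count_with_shift(0x0000000Fu) == 4);     /* four bits set */
    assert(count_without_shift(0x0000000Fu) == 32); /* bit 0 set: counted 32 times */
    assert(count_with_shift(0x0000000Eu) == 3);     /* bits 1..3 set */
    assert(count_without_shift(0x0000000Eu) == 0);  /* bit 0 clear: nothing counted */
}
```

Under this model the sequence without the shift can only ever return 0 or 32, depending on bit 0 of the mask, whereas the sequence with the shift returns the number of set bits.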
Please let me know if you need any further information.
Garry Wood.