[fpc-devel] Producing assembly with less branches?
Stefan Glienke
sglienke at dsharp.org
Sun Jul 19 23:37:10 CEST 2020
----- Reply to message -----
> Subject: [fpc-devel] Producing assembly with less branches?
> From: Stefan Glienke <sglienke at dsharp.org>
> To: <fpc-devel at lists.freepascal.org>
>> Hi,
>> I am not sure whether anything significant changed in trunk compared to 3.2
>> with respect to the optimized code being generated, but I am quite
>> disappointed that fpc (checked win64 with -O3 and -O4) does not use cmovxx
>> instructions and the like for the most basic things and produces terrible
>> code like this:
>> unit1.pas:49 if left < right then
>> 000000010002E3C0 39ca cmp edx,ecx
>> 000000010002E3C2 7e06 jle 0x10002e3ca <COMPAREINT+10>
>> unit1.pas:50 Result := -1
>> 000000010002E3C4 b8ffffffff mov eax,0xffffffff
>> 000000010002E3C9 c3 ret
>> unit1.pas:51 else if left > right then
>> 000000010002E3CA 39ca cmp edx,ecx
>> 000000010002E3CC 7d06 jge 0x10002e3d4 <COMPAREINT+20>
>> unit1.pas:52 Result := 1
>> 000000010002E3CE b801000000 mov eax,0x1
>> 000000010002E3D3 c3 ret
>> unit1.pas:54 Result := 0;
>> 000000010002E3D4 31c0 xor eax,eax
>> unit1.pas:55 end;
>> 000000010002E3D6 c3 ret
>> Similar for even simpler things:
>> unit1.pas:43 if i < 0 then
>> 000000010002E3A1 85c0 test eax,eax
>> 000000010002E3A3 7d03 jge 0x10002e3a8 <BUTTON1CLICK+72>
>> unit1.pas:44 i := 0;
>> 000000010002E3A5 31c0 xor eax,eax
>> 000000010002E3A7 90 nop
>> IMO someone should work on this and make the compiler produce fewer
>> branches. I am not sure whether that is on your list, but it should be
>> looked at.
> It's already done in trunk (sadly not in 3.2.0).
> For a cmov instruction to be emitted, the code has to meet two conditions:
> 1) the if statement has no else part
> 2) the assigned value comes from a variable (not a constant).
>
> To benefit from cmov, your code has to look like this:
>
> function cmov2(left, right : longint):longint;
> var l1,lf: longint;
>     r : longint;
> begin
>   l1:=1;
>   lf:=-1;
>   r:=0;
>   if left > right then
>   begin
>     r:=lf;
>   end;// else
>   if left < right then
>   begin
>     r:=l1;
>   end;// else r:=0;
>   cmov2:=r;
> end;
>
>
> 00400370 b9 01 00 00 00 mov ecx,00000001h
> 00400375 ba ff ff ff ff mov edx,0ffffffffh
> 0040037a 31 c0 xor eax,eax
> 0040037c 39 fe cmp esi,edi
> 0040037e 0f 4c c2 cmovl eax,edx
> 00400381 39 fe cmp esi,edi
> 00400383 0f 4f c1 cmovnle eax,ecx
> 00400386 c3 ret
That is still kind of disappointing compared to what it could be. Although
this is simple code, a modern compiler should try to eliminate conditional
jumps, even with the incredibly powerful branch predictors available nowadays.
clang and gcc emit the following; I would guess they detect quite a few
common patterns like this.
xor ecx, ecx
cmp eax, edx
mov eax, -1
setg cl
cmovge eax, ecx
ret

cmp eax, edx
mov edx, -1
setg al
movzx eax, al
cmovl eax, edx
ret
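
For illustration only, here is a minimal sketch of how the two examples could
be written at source level so that they contain as little explicit branching
as possible. The names CompareIntBranchless and ClampToZero are hypothetical
and not taken from any unit in this thread. Whether a given fpc version
actually turns these into setcc/cmov instructions is an assumption that has to
be checked in the generated assembly.

```pascal
program BranchlessSketch;

{$mode objfpc}

// Hypothetical helper: three-way compare without an if statement.
// Ord() of a Boolean is 0 or 1, so the difference is -1, 0 or 1.
function CompareIntBranchless(left, right: LongInt): LongInt;
begin
  Result := Ord(left > right) - Ord(left < right);
end;

// Hypothetical helper: clamp written to follow the two conditions quoted
// above (no else part, value assigned from a variable, not a constant).
// That this is enough to get a cmov is an assumption, not a guarantee.
function ClampToZero(i: LongInt): LongInt;
var
  zero: LongInt;
begin
  zero := 0;
  Result := i;
  if Result < 0 then
    Result := zero;
end;

begin
  WriteLn(CompareIntBranchless(1, 2));  // -1
  WriteLn(CompareIntBranchless(2, 2));  // 0
  WriteLn(CompareIntBranchless(3, 2));  // 1
  WriteLn(ClampToZero(-5));             // 0
  WriteLn(ClampToZero(7));              // 7
end.
```

The Ord() form keeps the usual convention (negative when left < right), which
is the opposite sign of the cmov2 example quoted above.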