<HTML>
<style> BODY { font-family:Arial, Helvetica, sans-serif;font-size:12px; }</style>Just wrote some Pascal functions for ClampInt to test it:<br>
<br>
function ClampInt1(X: LongInt): LongInt; inline;<br>
begin<br>
if (X < 0) then<br>
Result := 0<br>
else<br>
Result := X;<br>
end;<br>
<br>
function ClampInt2(X: LongInt): LongInt; inline;<br>
begin<br>
Result := 0;<br>
if (X > 0) then Result := X;<br>
end;<br>
<br>
<br>
The first one produces relatively convoluted assembly language when calling "I := ClampInt1(I);"<br>
<br>
0000000100001634 89f8 mov %edi,%eax <-- %edi contains I<br>
0000000100001636 85c0 test %eax,%eax<br>
0000000100001638 7d06 jge 0x100001640 <main+112><br>
000000010000163A 31d2 xor %edx,%edx<br>
000000010000163C eb04 jmp 0x100001642 <main+114><br>
000000010000163E 6690 xchg %ax,%ax <-- This appears to be padding for alignment and is never actually executed<br>
0000000100001640 89c2 mov %eax,%edx<br>
0000000100001642 89d7 mov %edx,%edi<br>
<br>
The second one, from calling "I := ClampInt2(I);":<br>
<br>
00000001000016CA 89f8 mov %edi,%eax<br>
00000001000016CC 31d2 xor %edx,%edx<br>
00000001000016CE 85c0 test %eax,%eax<br>
00000001000016D0 0f4fd0 cmovg %eax,%edx<br>
00000001000016D3 89d7 mov %edx,%edi<br>
<br>
<div>The middle three instructions match the hand-written assembler version quoted below, more or less! The peephole optimizer, or some kind of deeper optimizer that uses data-flow analysis, could do a better job though and remove the use of %eax completely, working on %edi alone. In this instance, it's a good example of why it's worth trying to do things in Pascal first. Even so, the three instructions in isolation are fairly well-optimised: XOR and TEST are independent, so they can execute simultaneously on two different ALUs, and CMOVG then typically adds 1 or 2 cycles depending on the processor (not on whether the condition holds), for around 2 to 3 cycles in total.</div><br>
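<br>
For anyone who wants to confirm that the two Pascal variants really are interchangeable before comparing their assembly, here is a minimal sketch of a check. The program name and the sample values are my own assumptions, not part of the original test programs:<br>

```pascal
program ClampCheck;
{$mode objfpc}

function ClampInt1(X: LongInt): LongInt; inline;
begin
  if X < 0 then
    Result := 0
  else
    Result := X;
end;

function ClampInt2(X: LongInt): LongInt; inline;
begin
  Result := 0;
  if X > 0 then
    Result := X;
end;

const
  { Hypothetical sample inputs covering both signs, zero and the extremes }
  Samples: array[0..5] of LongInt =
    (Low(LongInt), -1, 0, 1, 42, High(LongInt));

var
  I: Integer;
begin
  for I := Low(Samples) to High(Samples) do
    if ClampInt1(Samples[I]) <> ClampInt2(Samples[I]) then
    begin
      WriteLn('Mismatch for ', Samples[I]);
      Halt(1);
    end;
  WriteLn('Both variants agree on all samples');
end.
```

Whether the second variant compiles down to CMOVG will depend on the compiler version, target and optimisation settings, so the listings above should be read as one observed result rather than a guarantee.<br>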
In one of my supplied test programs, I had the following code for ClampInt:<br>
<br>
function ClampInt(X: LongInt): LongInt; assembler; nostackframe; inline;<br>
asm<br>
<div> CMP ECX, 0<br>
MOV EAX, 0<br>
CMOVG EAX, ECX<br>
</div>end;<br>
<br>
The code is slightly larger, but it should be just as fast. Its real purpose was to test the peephole optimizer: I realised during development that the FLAGS register wasn't marked as in use when the code was inlined, so the peephole optimizer changed MOV EAX, 0 to XOR EAX, EAX. XOR overwrites the flags that CMP has just set, which ruins the CMOVG instruction. CMP ECX, 0 is changed to TEST ECX, ECX without incident, even after I fixed the issue with the FLAGS usage.<br>
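<br>
To make the FLAGS problem concrete, here is a rough Pascal model of the three flags that CMOVG reads. It is only an illustration: the type and routine names (TFlags, DoCmp, DoXorSelf, CondG, ClampGood, ClampBroken) are hypothetical and have nothing to do with the compiler's internals. It shows that once XOR EAX, EAX sits between CMP and CMOVG, the zero flag is always set and the move never happens:<br>

```pascal
program FlagsClobber;
{$mode objfpc}

type
  { Only the flags that the "greater" condition inspects }
  TFlags = record
    ZFlag, SFlag, OFlag: Boolean;
  end;

{ Models CMP A, B: flags are set from the 32-bit result of A - B }
function DoCmp(A, B: LongInt): TFlags;
var
  Diff: Int64;
begin
  Diff := Int64(A) - Int64(B);
  Result.ZFlag := Diff = 0;
  { Signed overflow: the exact difference does not fit in 32 bits }
  Result.OFlag := (Diff < Low(LongInt)) or (Diff > High(LongInt));
  { Sign flag: bit 31 of the truncated 32-bit result }
  Result.SFlag := (Diff and $80000000) <> 0;
end;

{ Models XOR R, R: the result is zero, so ZF=1, SF=0, OF=0 }
function DoXorSelf: TFlags;
begin
  Result.ZFlag := True;
  Result.SFlag := False;
  Result.OFlag := False;
end;

{ The condition tested by CMOVG: ZF=0 and SF=OF }
function CondG(const F: TFlags): Boolean;
begin
  Result := (not F.ZFlag) and (F.SFlag = F.OFlag);
end;

function ClampGood(X: LongInt): LongInt;
var
  F: TFlags;
begin
  F := DoCmp(X, 0);     { CMP ECX, 0 }
  Result := 0;          { MOV EAX, 0 leaves the flags alone }
  if CondG(F) then      { CMOVG EAX, ECX }
    Result := X;
end;

function ClampBroken(X: LongInt): LongInt;
var
  F: TFlags;
begin
  F := DoCmp(X, 0);     { CMP ECX, 0 }
  Result := 0;
  F := DoXorSelf;       { XOR EAX, EAX overwrites the flags from CMP }
  if CondG(F) then      { CMOVG EAX, ECX is never taken now }
    Result := X;
end;

begin
  WriteLn(ClampGood(7), ' ', ClampBroken(7));   { prints "7 0" }
  WriteLn(ClampGood(-7), ' ', ClampBroken(-7)); { prints "0 0" }
end.
```

In other words, the mis-optimised version returns 0 for every input, which is why the MOV-to-XOR change is only safe when the flags are not live at that point.<br>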
<br>
A better function to showcase might be the "ternary" function... equivalent to the "b ? x : y" notation in C:<br>
<br>
function IIf(Condition: Boolean; TrueVal, FalseVal: LongInt): LongInt; assembler; nostackframe; inline;<br>
asm<br>
TEST CL, CL<br>
MOV EAX, EDX<br>
CMOVE EAX, R8D<br>
end;<br>
<br>
Once again, there's probably a Pascal version that's just as fast, but the idea with this one is that it ties into the peephole optimizer, which should eventually be able to change triplets like "SETL CL; TEST CL, CL; CMOVE EAX, R8D" (with "SETL CL" coming from a Boolean expression in the actual parameter) into a single CMOVNL EAX, R8D placed after MOV EAX, EDX.<br>
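<br>
For comparison, a plain Pascal counterpart to IIf could look something like the sketch below. It follows the same "assign the default, then conditionally overwrite" shape as ClampInt2. I haven't checked what assembly it produces, so whether it ends up as a CMOV is an assumption that depends on the compiler version and optimisation settings:<br>

```pascal
program IIfDemo;
{$mode objfpc}

{ Pure Pascal sketch of the ternary helper; same signature as the assembler version }
function IIf(Condition: Boolean; TrueVal, FalseVal: LongInt): LongInt; inline;
begin
  Result := FalseVal;
  if Condition then
    Result := TrueVal;
end;

var
  A, B: LongInt;
begin
  A := 3;
  B := 10;
  WriteLn(IIf(A < B, A, B)); { prints 3, the smaller value }
  WriteLn(IIf(A > B, A, B)); { prints 10, the larger value }
end.
```

Note that, unlike "b ? x : y" in C, both TrueVal and FalseVal are evaluated before the call, so this is only a drop-in replacement when neither argument has side effects.<br>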
<br>
Gareth aka. Kit<br>
<br>
<br>
<span style="font-weight: bold;">On Sun 17/03/19 22:47 , "Marģers ." margers.roked@inbox.lv sent:<br>
</span><blockquote style="BORDER-LEFT: #F5F5F5 2px solid; MARGIN-LEFT: 5px; MARGIN-RIGHT: 0px; PADDING-LEFT: 5px; PADDING-RIGHT: 0px">
<br>
<br>
----- Reply to message -----
<br>
Subject: Re: [fpc-devel] Successful implementation of inline support for pure assembler routines on x86
<br>
Date: 18 March 2019, 00:28:10
<br>
From: J. Gareth Moreton &lt;gareth@moreton-family.com&gt;
<br>
To: FPC developers' list &lt;fpc-devel@lists.freepascal.org&gt;
<br>
<br>
<span style="color: rgb(102, 102, 102);">> To use the integer clamp function as an example (if x < 0 then x := 0):
</span><br>
<br>
<br>
<span style="color: rgb(102, 102, 102);">> { Microsoft x64 calling convention... X is in ECX }
</span><br>
<span style="color: rgb(102, 102, 102);">> function ClampInt(X: LongInt): LongInt; assembler; nostackframe; inline;
</span><br>
<br>
<span style="color: rgb(102, 102, 102);">> asm
</span><br>
<span style="color: rgb(102, 102, 102);">> XOR EAX, EAX
</span><br>
<span style="color: rgb(102, 102, 102);">> TEST ECX, ECX
</span><br>
<span style="color: rgb(102, 102, 102);">> CMOVG EAX, ECX
</span><br>
<span style="color: rgb(102, 102, 102);">> end;
</span><br>
<br>
try code:
<br>
y:=0;
<br>
if x < 0 then x:=y;
<br>
<br>
<br>
<br>
</blockquote></HTML>