[fpc-devel] Optimization for 'mod'
J. Gareth Moreton
gareth at moreton-family.com
Fri Jan 5 16:05:17 CET 2018
I've used DivMod myself before, akin to something like this under Windows:
{$ASMMODE Intel}
function DivMod(N, D: QWord; var R: QWord): QWord; assembler; nostackframe;
asm
MOV R9, RDX
MOV RAX, RCX
XOR RDX, RDX { Zero RDX because the DIV operation uses RDX:RAX as the numerator }
DIV R9 { Quotient is in RAX and remainder is in RDX }
MOV [R8], RDX
end;
The drawback is the overhead of calling a procedure, and the fact it uses DIV, which can take upwards of 50
cycles to execute for 64-bit operands. It's a very slow command, hence the 9-nanosecond metric that I
measured compared to times under 2 nanoseconds if converted into a multiplication. True, you could write a
Pascal variant and inline the function to avoid the expensive procedure call, but you're still using DIV.
If you know the divisor is a constant, then you can change the division into a multiplication by calculating
a reciprocal value (along with an addition and bit shift to correct rounding errors), which is far faster to
execute. Also, most programmers won't think to use something like DivMod, or may not want to for sake of
convenience, especially with something like the following:
WriteLn(Format('Time taken: %d:%.02d', [Seconds div 60, Seconds mod 60]));
Kit
More information about the fpc-devel
mailing list