[fpc-devel] Optimization for 'mod'

J. Gareth Moreton gareth at moreton-family.com
Sat Jan 6 17:00:10 CET 2018


Hi Ched,

That optimisation actually already exists, 
as does changing unsigned division by a 
power of two into SHR.

Another optimisation I put in... if you're 
doing x mod a, where a is greater than 
half the maximum value for the word size 
(e.g. $80000000 for 32-bit), then it can 
be changed into what effectively amounts 
to a comparison and a subtraction, 
although I used CMOV to avoid branching.

It also seems that Florian was happy with 
the patch, albeit with a couple of changes 
regarding coding style and an issue with 
register allocation. So a happy ending, I 
think!

Kit

On Fri 05/01/18 10:16 , Ched 
charles.edouard.des.vastes.vignes at gmail.co
m sent:
> Hi,
> 
> 
> 
> I haven't done the check if already 
implemented, but there are some
> interesting optimisations possible like:
> if a is a non-negative integer and t is 
a non-negative integer power of
> two, then
> a mod t
> 
> can be translated at compile time to
> 
> a and (t-1)
> 
> 
> 
> Cheers, Ched'
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Le 05.01.2018 à 04:32, J. Gareth Moreton 
a écrit :
> 
> > Hi everyone,
> 
> > 
> 
> > During my exploration of FPC, I 
noticed that the
> 'mod' operator is not optimized in the 
same way 'div' is.
> > Though not deeply tested on other 
platforms yet,
> I have developed a patch for 3.1.1 that 
implements the
> > algorithm that calculates a reciprocal 
constant
> for unsigned 'mod' operations with a 
constant denominator.
> > Just my little contribution to the 
compiler - I
> hope it is sound.
> > 
> 
> > https://bugs.freepascal.org/view.php?
id=32945
> > 
> 
> > I attached a couple of test projects 
that
> contain numerous 32-bit constants 
(testfile2.pp) and 64-bit
> > constants (testfile3.pp) and a means 
to output
> timing metrics.  A sample of such is 
shown in the bug report.
> > 
> 
> > Note that I unfortunately made a 
slight mistake
> that I only noticed after uploading the 
patch. The line:
> > 
> 
> > m := qword(-aint(d)); { Two's 
complement of d }
> 
> > 
> 
> > ...should be...
> 
> > 
> 
> > m := aword(-aint(d)); { Two's 
complement of d }
> 
> > 
> 
> > 
> 
> > NOTE: One of the optimised algorithms,
> specifically for 32-bit divisors greater 
than $80000000, makes use of
> > CMOV to avoid jumps.  If CMOV is not 
available
> (it checks "CPUX86_HAS_CMOV in
> >
> 
cpu_capabilities[current_settings.cputype]
"), then the code will fall
> back to the default 'mod'
> > implementation that was already 
present.
> 
> > 
> 
> > I should note that there's still a lot 
of room
> for improvement.  For example, code such 
as this:
> > 
> 
> > Minutes := t div 60;
> 
> > Seconds := t mod 60;
> 
> > 
> 
> > ... is badly optimised, because the 
division is
> calculated twice (without the patch, 't 
div 60' uses a
> > reciprocal constant, while 't mod 60' 
just uses
> 'DIV'), whereas, ideally, it should only 
be calculated once
> > since the numerator and denominator 
are the
> same... either a single call to DIV, or 
calculating the quotient
> > via the reciprocal multiplication 
method, then
> obtaining the remainder by multiplying 
the quotient by the
> > original denominator (60 in this case) 
and
> subtracting the product from the 
original numerator.
> > 
> 
> > Gareth aka. Kit
> 
> 
__________________________________________
_____
> 
> fpc-devel maillist  -  fpc-
devel at lists.freepascal.org
> http://lists.freepascal.org/cgi-
bin/mailman/listinfo/fpc-devel
> 
> 
> 
> 




More information about the fpc-devel mailing list