[fpc-devel] Division nodes
Stefan Glienke
sglienke at dsharp.org
Thu May 11 13:07:57 CEST 2023
Looks like a rather disadvantageous way to avoid the idiv instruction because x div -1 = -x and x mod -1 = 0.
I ran a quick benchmark doing a lot of integer divisions where sometimes (randomly) the divisor was -1. When the occurence was rare enough (~5%) the performance was not impacted, the higher the occurence of -1 was the slower it became to almost half as fast. Only when less than 5% of the divisors were *not* -1 the performance was better up to twice as fast when all divisors were -1. Of couse ymmv as it depends on the CPU and the branch predictor behavior but it shows that this "optimization" is hardly any good.
I cannot think of a realistic case where 95% of your divisors are -1 and you really need to save those few extra cycles of calling idiv.
> On 11/05/2023 11:04 CEST J. Gareth Moreton via fpc-devel <fpc-devel at lists.freepascal.org> wrote:
>
>
> Hi everyone,
>
> I need to ask a question about how division nodes are set up (I'm
> looking at possible optimisation techniques). I've written the
> following procedure:
>
> procedure DoDivMod(N, D: Integer; out Q, R: Integer);
> begin
> Q := N div D;
> R := N mod D;
> end;
>
> Fairly simple and to the point. However, even before the first node
> pass, the following node tree is generated for an integer division
> operation:
>
> <statementn pos="24,10">
> <ifn resultdef="$void" pos="24,10" flags="nf_internal">
> <condition>
> <equaln resultdef="Boolean" pos="24,10" flags="nf_internal">
> <loadn resultdef="LongInt" pos="24,14">
> <symbol>D</symbol>
> </loadn>
> <ordconstn resultdef="LongInt" pos="24,10" rangecheck="FALSE">
> <value>-1</value>
> </ordconstn>
> </equaln>
> </condition>
> <then>
> <assignn resultdef="$void" pos="24,10" flags="nf_internal">
> <temprefn resultdef="LongInt" pos="24,10" flags="nf_write"
> id="$7C585E10">
> <typedef>LongInt</typedef>
> <tempflags>ti_may_be_in_reg</tempflags>
> <temptype>tt_persistent</temptype>
> </temprefn>
> <unaryminusn resultdef="LongInt" pos="24,10">
> <loadn resultdef="LongInt" pos="24,8">
> <symbol>N</symbol>
> </loadn>
> </unaryminusn>
> </assignn>
> </then>
> <else>
> <assignn resultdef="$void" pos="24,10" flags="nf_internal">
> <temprefn resultdef="LongInt" pos="24,10" flags="nf_write"
> id="$7C585E10">
> <typedef>LongInt</typedef>
> <tempflags>ti_may_be_in_reg</tempflags>
> <temptype>tt_persistent</temptype>
> </temprefn>
> <divn resultdef="LongInt" pos="24,10">
> <loadn resultdef="LongInt" pos="24,8">
> <symbol>N</symbol>
> </loadn>
> <loadn resultdef="LongInt" pos="24,14">
> <symbol>D</symbol>
> </loadn>
> </divn>
> </assignn>
> </else>
> </ifn>
> </statementn>
>
> Something similar is made for "mod" as well. I have to ask though... is
> it really necessary to check to see if the divisor is -1 and have a
> distinct assignment for it? It's a bit of a rare edge case that usually
> just slows things down since it tends to add a comparison and a
> conditional jump to the final assembly language. Is there some
> anomalous behaviour to a processor's division routine if the divisor is -1?
>
> At the very least, would it be possible to remove the conditional check
> when compiling under -Os?
>
> (I intend to see if it's possible to merge "N div D" and "N mod D" on
> x86, and possibly other processors that have a combined DIV/MOD operator).
>
> Kit
>
> _______________________________________________
> fpc-devel maillist - fpc-devel at lists.freepascal.org
> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
More information about the fpc-devel
mailing list