[fpc-devel] Division nodes

Thu May 11 16:39:41 CEST 2023

It does seem odd.  In a practical sense, the only time I can see -1 
being a common input among other random numbers is if it's an error 
value, in which case you would most likely do special handling rather 
than pass it through a division operation.  With the slowdown that comes 
from additional branch prediction, it just seems like unnecessary fluff, 
but I need to double-check to see if there's a very good reason behind 
their generation (if it's a platform-specific problem, it should be 
moved to that platform's specific first pass)  Now I just need to find 
out where those nodes are generated - they're proving elusive!

Note that using constant divisors uses a different optimisation, so this 
only applies to variable divisors.

Kit

On 11/05/2023 12:07, Stefan Glienke via fpc-devel wrote:
> Looks like a rather disadvantageous way to avoid the idiv instruction because x div -1 = -x and x mod -1 = 0.
>
> I ran a quick benchmark doing a lot of integer divisions where sometimes (randomly) the divisor was -1. When the occurence was rare enough (~5%) the performance was not impacted, the higher the occurence of -1 was the slower it became to almost half as fast. Only when less than 5% of the divisors were *not* -1 the performance was better up to twice as fast when all divisors were -1. Of couse ymmv as it depends on the CPU and the branch predictor behavior but it shows that this "optimization" is hardly any good.
>
> I cannot think of a realistic case where 95% of your divisors are -1 and you really need to save those few extra cycles of calling idiv.
>
>> On 11/05/2023 11:04 CEST J. Gareth Moreton via fpc-devel <fpc-devel at lists.freepascal.org> wrote:
>>
>>   
>> Hi everyone,
>>
>> I need to ask a question about how division nodes are set up (I'm
>> looking at possible optimisation techniques).  I've written the
>> following procedure:
>>
>> procedure DoDivMod(N, D: Integer; out Q, R: Integer);
>> begin
>>     Q := N div D;
>>     R := N mod D;
>> end;
>>
>> Fairly simple and to the point.  However, even before the first node
>> pass, the following node tree is generated for an integer division
>> operation:
>>
>> <statementn pos="24,10">
>>      <ifn resultdef="$void" pos="24,10" flags="nf_internal">
>>         <condition>
>>            <equaln resultdef="Boolean" pos="24,10" flags="nf_internal">
>>               <loadn resultdef="LongInt" pos="24,14">
>>                  <symbol>D</symbol>
>>               </loadn>
>>               <ordconstn resultdef="LongInt" pos="24,10" rangecheck="FALSE">
>>                  <value>-1</value>
>>               </ordconstn>
>>            </equaln>
>>         </condition>
>>         <then>
>>            <assignn resultdef="$void" pos="24,10" flags="nf_internal">
>>               <temprefn resultdef="LongInt" pos="24,10" flags="nf_write"
>> id="$7C585E10">
>>                  <typedef>LongInt</typedef>
>> <tempflags>ti_may_be_in_reg</tempflags>
>>                  <temptype>tt_persistent</temptype>
>>               </temprefn>
>>               <unaryminusn resultdef="LongInt" pos="24,10">
>>                  <loadn resultdef="LongInt" pos="24,8">
>>                     <symbol>N</symbol>
>>                  </loadn>
>>               </unaryminusn>
>>            </assignn>
>>         </then>
>>         <else>
>>            <assignn resultdef="$void" pos="24,10" flags="nf_internal">
>>               <temprefn resultdef="LongInt" pos="24,10" flags="nf_write"
>> id="$7C585E10">
>>                  <typedef>LongInt</typedef>
>> <tempflags>ti_may_be_in_reg</tempflags>
>>                  <temptype>tt_persistent</temptype>
>>               </temprefn>
>>               <divn resultdef="LongInt" pos="24,10">
>>                  <loadn resultdef="LongInt" pos="24,8">
>>                     <symbol>N</symbol>
>>                  </loadn>
>>                  <loadn resultdef="LongInt" pos="24,14">
>>                     <symbol>D</symbol>
>>                  </loadn>
>>               </divn>
>>            </assignn>
>>         </else>
>>      </ifn>
>> </statementn>
>>
>> Something similar is made for "mod" as well.  I have to ask though... is
>> it really necessary to check to see if the divisor is -1 and have a
>> distinct assignment for it?  It's a bit of a rare edge case that usually
>> just slows things down since it tends to add a comparison and a
>> conditional jump to the final assembly language.  Is there some
>> anomalous behaviour to a processor's division routine if the divisor is -1?
>>
>> At the very least, would it be possible to remove the conditional check
>> when compiling under -Os?
>>
>> (I intend to see if it's possible to merge "N div D" and "N mod D" on
>> x86, and possibly other processors that have a combined DIV/MOD operator).
>>
>> Kit
>>
>> _______________________________________________
>> fpc-devel maillist  -  fpc-devel at lists.freepascal.org
>> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
> _______________________________________________
> fpc-devel maillist  -  fpc-devel at lists.freepascal.org
> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
>