[fpc-devel] Policy on platform-specific compiler code

Sat Oct 17 00:14:20 CEST 2020

On 16/10/2020 10:47, Jonas Maebe via fpc-devel wrote:
> On 16/10/2020 10:14, J. Gareth Moreton via fpc-devel wrote:
>
>> Before I go optimising the wrong thing, I have a question to ask.
>> What's the policy on platform-specific assembly language in the
>> compiler, or any code designed to run on a specific (source) platform
>> (and using a more generic implementation otherwise via $ifdef)?  I ask
>> because I have a faster algorithm for "calc_divconst_magic_unsigned" in
>> 'compiler/cgutils.pas', but it's only able to work because it can take
>> advantage of the x86 DIV instruction using RDX:RAX (or EDX:EAX) as a
>> double-wide dividend. It is somewhat faster than what currently exists
>> because of the lack of a loop whose iteration count is proportional to
>> log2(d), where d is the desired divisor (in other words, it's slower the
>> bigger the divisor is, whereas my algorithm is constant speed).
> In general, there should be no assembly language in the compiler. I also
> don't think that's worth it in this case. Unless (or maybe "even if")
> your code contains nothing but divisions by constants, I doubt this code
> has a significant effect on the total compile time.

Division by constants occurs fairly frequently in code. For 
example, dividing by 10000 whenever Currency is used, and 1000 often 
appears in timing measurements (e.g. if t is in milliseconds, then t div 
1000 is seconds and t mod 1000 is the leftover milliseconds).
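
To illustrate why these cases matter, here is a minimal sketch (in
Python for brevity, not taken from the FPC sources) of how an unsigned
32-bit "t div 1000" and "t mod 1000" can be done with a multiply and a
shift instead of a hardware divide. The names MAGIC, SHIFT and split_ms
are made up for this example, and the shift of 38 is an assumption that
holds for the divisor 1000 at 32 bits only.

```python
# Illustrative only: unsigned 32-bit "t div 1000" and "t mod 1000"
# using a multiply and a shift instead of a divide instruction.
SHIFT = 38
MAGIC = -(-(1 << SHIFT) // 1000)  # ceil(2**38 / 1000) = 0x10624DD3


def split_ms(t):
    """Return (seconds, leftover_ms) for 0 <= t < 2**32."""
    q = (t * MAGIC) >> SHIFT  # quotient: t div 1000
    r = t - q * 1000          # remainder: t mod 1000
    return q, r
```

A constant like MAGIC is the kind of value that a routine such as
calc_divconst_magic_unsigned has to work out once per constant divisor
at compile time.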

The existing code contains two divisions by a variable (so they can't be 
optimised) and a loop that has, at most, N iterations, where N is the 
bit size (often 32 or 64).  The loop contains only addition, subtraction 
and multiplication, and 3 branches to contend with (not including the 
repeat...until jump).  My code contains a single DIV, but also a BSR 
which is effectively used to get the base-2 logarithm of the divisor 
(it also throws an internal error if the divisor is zero, since this 
should have been caught already).
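
For readers who want to see the general shape of a loop-free
calculation, the following is a rough sketch in Python. It is not the
function described above, and it is not the existing FPC code. It uses
the "round-up" formulation from Granlund and Montgomery's "Division by
Invariant Integers using Multiplication", where the magic number comes
from one division whose dividend has 2**l - d in the high word and zero
in the low word (the kind of dividend that fits in RDX:RAX or EDX:EAX).
The names magic_unsigned and div_by_magic are hypothetical, and
bit_length stands in for BSR.

```python
def magic_unsigned(d, bits=32):
    """Return (m, l) so that n // d == (mulhi(m, n) + n) >> l
    for every 0 <= n < 2**bits, where 1 <= d < 2**bits."""
    if d <= 0:
        # Stand-in for the internal error on a zero divisor.
        raise ValueError("divisor must be positive")
    l = (d - 1).bit_length()  # ceil(log2(d)); the role BSR would play
    # One double-wide division: high word (2**l - d), low word 0.
    m = ((((1 << l) - d) << bits) // d) + 1
    return m, l


def div_by_magic(n, m, l, bits=32):
    """Divide n by the constant that (m, l) was computed for."""
    t = (m * n) >> bits  # high half of the double-wide product
    # Python integers cannot overflow; fixed-width code would split
    # this addition to avoid a carry out of the register.
    return (t + n) >> l
```

For example, magic_unsigned(10000) followed by div_by_magic(n, m, l)
should agree with n div 10000 for any 32-bit unsigned n. No iteration
count depends on the size of the divisor.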

Granted, you may be right and the saving won't be worth it, not to 
mention the additional complexity (and my function currently fails on 
certain divisors unexpectedly, so I'll have to do some deeper testing if 
just for my own peace of mind!) - only a lot of timing tests will 
determine that.  Nevertheless, thanks for providing the article on 
calculating the reciprocal - that's definitely helpful in 
understanding what's going on.

Gareth aka. Kit