[fpc-devel] Question to developers

Fri Sep 17 18:44:25 CEST 2021

Personally, I like the idea of having as much optimization as possible, 
but I'm likely in the minority here.  Since your original submission 
covered the (probably) most often used case, I'd say at this point, it's 
probably fine the way it is.

On the other hand, depending on timing, (I.E. how long does it take for 
the extra optimizations, and what kind of benefits would accrue), it 
might be worth persuing, but if as you suspect, it's rarely used, and 
the optimization takes long enough, that it would overshadow any gains 
in it's being skipped, (I'm talking compile time, not runtime) then it 
may not be worth continuing.  As I said, personally, I'd love as many 
optimizations as possible.  I'm personally of the opinion that software 
is entirely too bloated and includes way too much junk in it these days, 
which is why I use powerbasic for my windows development whenever I have 
a windows only project.  I use FPC if I have a cross platform project, 
especially if I can convince the end users/financial folks that java 
isn't necessary.  I've been a big fan of pascal every since I first 
learned it back in 1988, and I'd use it for everything if powerbasic 
wasn't so darned convenient, especially for it's extremely small 
executables, and it's ability to operate with a bare minimum of dlls.

So, anything that gets freepascal closer to matching powerbasic in 
executable size/speed is fine with me.

On 9/17/2021 11:55 AM, J. Gareth Moreton via fpc-devel wrote:
> Hi everyone,
>
> I have a question for third-party developers, especially mathematical, 
> scientific and games programmers...
>
> How often would you use a construct akin to "case x mod n of", where n 
> is a constant and not a power of 2?  I ask because while I wait for my 
> "(x mod n) = 0" optimisation to be approved, I've discovered that it 
> could be extended to cover remainders other than 0.  For example, 
> starting with (x mod 3) = 0 (unsigned 32-bit using Intel notation):
>
> MOV EAX, x
> IMUL EAX, $AAAAAAAB
> CMP EAX, $55555555
> JBE @Remainder0
>
> For (x mod 3) = 1:
>
> MOV EAX, x
> IMUL EAX, $AAAAAAAB
> SUB EAX, $AAAAAAAB
> CMP EAX, $55555554
> JBE @Remainder1
>
> Which in this case, can be simplified to:
>
> MOV EAX, x
> IMUL EAX, $AAAAAAAB
> CMP EAX, $AAAAAAAB
> JAE @Remainder1
>
> To keep with the simple case of n and 2^32 being relatively prime 
> (i.e. n is odd), this is due to the fact that for a given divisor n, 
> you can calculate its multiplicative inverse r such that nr = 1 (mod 
> 2^32).  Then if x is a multiple of n, it can be written as kn for some 
> integer k, and (kn)r = k (mod 2^32), and because n and 2^32 are 
> relatively prime, the domain has an exact one-to-one mapping onto the 
> range, so the output k cannot be produced by any other input after 
> being multiplied by r, and k falls in the range 0 to floor(2^32 / n), 
> which is 0 to $55555555 for n = 3.
>
> For remainder 1, x has the form kn + 1, and when multipled by r, the 
> result is congruent to k + r (mod 2^32).  Because of the 1-to-1 
> mapping, the ranges of k and k + r will not overlap, and in the case 
> for a divisor of 3 and a remainder of 1, maps onto the linear range 
> $AAAAAAAB (k = 0) to $FFFFFFFF (k = $55555554).  For completion, for 
> divisor 3 and a remainder of 2, the range is $55555556 to $AAAAAAAA 
> (note that 2r = $55555556 (mod 2^32)).
>
> I still got a couple of nuances to figure out, such as what to do with 
> even values of n and the exact length of each range (for n = 3, k can 
> range from 0 to $55555555 for remainder 0, but only 0 to $55555554 for 
> remainders 1 and 2... this is because $FFFFFFFF is an exact multiple 
> of 3).
>
> So long story short, would optimisations for "(x mod n) = c" and "case 
> (x mod n) of" be welcome, or is it overkill and "(x mod n) = 0" more 
> than enough?
>
> Gareth aka. Kit
>
>