[fpc-devel] Proposal for 2 new compiler functions

J. Gareth Moreton gareth at moreton-family.com
Wed Jun 13 20:46:40 CEST 2018



On Wed 13/06/18 20:28 , Florian Klämpfl 
florian at freepascal.org sent:
> Am 12.06.2018 um 14:42 schrieb J. Gareth 
Moreton:
> 
> > Hi everyone,
> 
> > 
> 
> > Sorry to flood the mailing list again 
with more
> ideas and experiments.
> > 
> 
> > I would like to propose introducing a 
new pair
> of in-built functions for the compiler.
> > 
> 
> > function FLog2(X: Cardinal): Cardinal;
> 
> > function CLog2(X: Cardinal): Cardinal;
> 
> > 
> 
> > FLog2 returns the base-2 logarithm of 
X, rounded
> down, while CLog2 returns the base-2 
logarithm of X, rounded up.
> > 
> 
> > To give examples where these functions 
could be
> useful, FLog2(X) + 1 indicates the 
minimum number of bits required to 
> > store X as an unsigned integer, while 
CLog2(X)
> is equal to the maximum number of 
iterations required for a binary search 
> > of an X-sized list.
> 
> > 
> 
> > Why should they be in-built though? 
With the
> binary search example, the size of the 
list is sometimes known at compile 
> > time, hence is a constant - therefore, 
its
> logarithm is also a constant.  By pre-
calculating the logarithm using
> the 
> > in-built function, it can be used to 
aid
> optimization such as loop unrolling.  It 
also makes them easier to
> inline, 
> > where FLog2(X) on x86_64-win64 
translates to a
> single line of assembly language: BSR 
EAX, ECX (unless X is zero, in 
> > which case ZF is set and EAX is 
undefined).
> 
> > 
> 
> > If there had to be a slight nuance 
though, it's
> what happens if X = 0, since log(0) = -
oo, which cannot be stored as an 
> > integer and may cause problems with 
the
> compiler.  I would propose it return 0 
as a special case, as this will
> fix most 
> > problems when it comes to loops and 
storage, and
> also ensure that FLog2 and CLog2 are 
"entire functions".  To
> give an 
> > optimised example of FLog(2) in x86-64 
assembly
> language:
> > 
> 
> > XOR EDX, EDX
> 
> > BSR EAX, ECX // ZF is set if ECX is 
zero
> 
> > CMOVZ EAX, EDX // Zero (EDX) written 
to result
> if ZF is set.
> > 
> 
> > Some kind of deep optimization could 
be used if
> the input is known to be non-zero and 
remove the instructions either 
> > side of BSR.
> 
> > 
> 
> 
> 
> Here again: I would first try find a 
good pascal implementation of the two
> functions, then check, what FPC 
produces, 
> actually, FPC has already a BsrDWord 
function which is very similar to
> FLog2, though it returns 255 for 0. And 
then I 
> would propose to improve the optimizer 
to generate code close to hand coded
> assembler.
> 
> 
> 
__________________________________________
_____
> 
> fpc-devel maillist  -  fpc-
devel at lists.freepascal.org
> http://lists.freepascal.org/cgi-
bin/mailman/listinfo/fpc-devel
> 
> 
> 
>

Aah, I didn't know about that function - 
thank you. I have looked up algorithms for 
floor(log2(x)), and the best ones I've 
found use a heap of bit shifts and other 
logical operations, and a few conditional 
checks, I think, which is all somewhat 
slower than the 10-cycle BSR instruction. 
Granted, I agree there should be a good 
Pascal implementation so it is portable.

For CLog2(x) the only solution I've found 
currently is FLog2(x-1)+1 while returning 
0 if x is 0 or 1. Mind you, an extra 
addition, subtraction and a conditional 
check isn't bad.

Gareth

P.S. BsrDWord is effectively FLog2, with 
the 255 return value standing in for -oo.



More information about the fpc-devel mailing list