[fpc-devel] Proposal for 2 new compiler functions
J. Gareth Moreton
gareth at moreton-family.com
Wed Jun 13 20:46:40 CEST 2018
On Wed 13/06/18 20:28 , Florian Klämpfl
florian at freepascal.org sent:
> Am 12.06.2018 um 14:42 schrieb J. Gareth
Moreton:
>
> > Hi everyone,
>
> >
>
> > Sorry to flood the mailing list again
with more
> ideas and experiments.
> >
>
> > I would like to propose introducing a
new pair
> of in-built functions for the compiler.
> >
>
> > function FLog2(X: Cardinal): Cardinal;
>
> > function CLog2(X: Cardinal): Cardinal;
>
> >
>
> > FLog2 returns the base-2 logarithm of
X, rounded
> down, while CLog2 returns the base-2
logarithm of X, rounded up.
> >
>
> > To give examples where these functions
could be
> useful, FLog2(X) + 1 indicates the
minimum number of bits required to
> > store X as an unsigned integer, while
CLog2(X)
> is equal to the maximum number of
iterations required for a binary search
> > of an X-sized list.
>
> >
>
> > Why should they be in-built though?
With the
> binary search example, the size of the
list is sometimes known at compile
> > time, hence is a constant - therefore,
its
> logarithm is also a constant. By pre-
calculating the logarithm using
> the
> > in-built function, it can be used to
aid
> optimization such as loop unrolling. It
also makes them easier to
> inline,
> > where FLog2(X) on x86_64-win64
translates to a
> single line of assembly language: BSR
EAX, ECX (unless X is zero, in
> > which case ZF is set and EAX is
undefined).
>
> >
>
> > If there had to be a slight nuance
though, it's
> what happens if X = 0, since log(0) = -
oo, which cannot be stored as an
> > integer and may cause problems with
the
> compiler. I would propose it return 0
as a special case, as this will
> fix most
> > problems when it comes to loops and
storage, and
> also ensure that FLog2 and CLog2 are
"entire functions". To
> give an
> > optimised example of FLog(2) in x86-64
assembly
> language:
> >
>
> > XOR EDX, EDX
>
> > BSR EAX, ECX // ZF is set if ECX is
zero
>
> > CMOVZ EAX, EDX // Zero (EDX) written
to result
> if ZF is set.
> >
>
> > Some kind of deep optimization could
be used if
> the input is known to be non-zero and
remove the instructions either
> > side of BSR.
>
> >
>
>
>
> Here again: I would first try find a
good pascal implementation of the two
> functions, then check, what FPC
produces,
> actually, FPC has already a BsrDWord
function which is very similar to
> FLog2, though it returns 255 for 0. And
then I
> would propose to improve the optimizer
to generate code close to hand coded
> assembler.
>
>
>
__________________________________________
_____
>
> fpc-devel maillist - fpc-
devel at lists.freepascal.org
> http://lists.freepascal.org/cgi-
bin/mailman/listinfo/fpc-devel
>
>
>
>
Aah, I didn't know about that function -
thank you. I have looked up algorithms for
floor(log2(x)), and the best ones I've
found use a heap of bit shifts and other
logical operations, and a few conditional
checks, I think, which is all somewhat
slower than the 10-cycle BSR instruction.
Granted, I agree there should be a good
Pascal implementation so it is portable.
For CLog2(x) the only solution I've found
currently is FLog2(x-1)+1 while returning
0 if x is 0 or 1. Mind you, an extra
addition, subtraction and a conditional
check isn't bad.
Gareth
P.S. BsrDWord is effectively FLog2, with
the 255 return value standing in for -oo.
More information about the fpc-devel
mailing list