[fpc-devel] Proposal for 2 new compiler functions
florian at freepascal.org
Wed Jun 13 21:28:44 CEST 2018
Am 12.06.2018 um 14:42 schrieb J. Gareth Moreton:
> Hi everyone,
> Sorry to flood the mailing list again with more ideas and experiments.
> I would like to propose introducing a new pair of in-built functions for the compiler.
> function FLog2(X: Cardinal): Cardinal;
> function CLog2(X: Cardinal): Cardinal;
> FLog2 returns the base-2 logarithm of X, rounded down, while CLog2 returns the base-2 logarithm of X, rounded up.
> To give examples where these functions could be useful, FLog2(X) + 1 indicates the minimum number of bits required to
> store X as an unsigned integer, while CLog2(X) is equal to the maximum number of iterations required for a binary search
> of an X-sized list.
> Why should they be in-built though? With the binary search example, the size of the list is sometimes known at compile
> time, hence is a constant - therefore, its logarithm is also a constant. By pre-calculating the logarithm using the
> in-built function, it can be used to aid optimization such as loop unrolling. It also makes them easier to inline,
> where FLog2(X) on x86_64-win64 translates to a single line of assembly language: BSR EAX, ECX (unless X is zero, in
> which case ZF is set and EAX is undefined).
> If there had to be a slight nuance though, it's what happens if X = 0, since log(0) = -oo, which cannot be stored as an
> integer and may cause problems with the compiler. I would propose it return 0 as a special case, as this will fix most
> problems when it comes to loops and storage, and also ensure that FLog2 and CLog2 are "entire functions". To give an
> optimised example of FLog(2) in x86-64 assembly language:
> XOR EDX, EDX
> BSR EAX, ECX // ZF is set if ECX is zero
> CMOVZ EAX, EDX // Zero (EDX) written to result if ZF is set.
> Some kind of deep optimization could be used if the input is known to be non-zero and remove the instructions either
> side of BSR.
Here again: I would first try find a good pascal implementation of the two functions, then check, what FPC produces,
actually, FPC has already a BsrDWord function which is very similar to FLog2, though it returns 255 for 0. And then I
would propose to improve the optimizer to generate code close to hand coded assembler.
More information about the fpc-devel