[fpc-pascal] Recursion optimization by compiler

Roger Bailey rb at doc.ic.ac.uk
Sat Sep 4 09:07:57 CEST 2010


On Sep 2010, at 20:32, José Mejuto wrote:

[ some stuff about recursion - at the bottom if you want to read it ]

This a misunderstanding of way recursion can be flattened into a loop. It's especially not useful because the transformed version still uses a stack, so it doesn't execute in constant space.

Consider the classic factorial function:

function fact ( i : integer ) : integer ;
begin
if i := 0 then
  fact := 1
else
  fact := i * fact ( i - 1 )
end ;

with initial call "fact ( <some-value> )"

For an actual parameter value "i" this requires O (i) space, because you need to create stack frame to hold the current value of "i" for use after calculating the value "factorial ( i - 1 )" and the return address of the calling routine, (which is actually the same in every stack frame except in the one created by the top-level call).

You can see this by writing a top-level-call and plugging in the definition until you hit the base-case:

fact ( 4 ) 
-> 4 * fact ( 3 ) 
-> 4 * ( 3 * fact ( 2 ) ) 
-> 4 * ( 3 * ( 2 * fact ( 1 ) ) ) 
-> 4 * ( 3 * ( 2 * ( 1 * fact ( 0 ) ) ) )
-> 4 * ( 3 * ( 2 * ( 1 * 1 ) ) )

None of the multiplications can be done until the final call of "fact", so all the values must be preserved until then.

Now consider this version:

function accfact ( f , i : integer ) ; 
begin
if i = 0 then
  accfact := f
else
  accfact := accfact ( f * i , i - 1 )
end ;

with an initial call of "accfact ( 1 , <some-value> )"

This is the standard "accumulating parameter" technique. All the calculation is performed on the descent, no local variables need to be kept, and the return address can simply be the point following the top-level call. The new stackframe can overwrite the old one (or rather the new values for the actual parameters "f" and "i" can simply overwrite the old ones).

If you the previous trick of plugging the definition into a top-level call you immediately get:

accfact ( 1 , 4 )
-> accfact ( 1 * 4 , 3 )

and we can do the multiplication before making the recursive call. This is "tail recursion".

For trees it's a lot trickier to write the processing routines in this kind of tail-recursive form.

Suppose we have a data type "tree" that can be empty or hold a value (say an integer) plus two subtrees, either of which may of course be empty at any level. Without getting into the minutiae of data representation and pointer-twiddling, assume functions "value", "left" and "right" to return the value and the two subtrees respectively, and "isempty" to check if a tree is empty.

The obvious way to process all the integers in such a tree (perhaps by forming their sum) is:

function sum ( t : tree ) : integer ;
begin
if isempty ( t ) then
  sum := 0
else
  sum := value ( t ) + sum ( left ( t ) ) + sum ( right ( t ) )
end ;

For a tree with "n" nodes this executes in O (n) space.

The accumulating parameter version is a bit less intuitive than "accfact":

function accsum ( s : integer ; t : tree ) : integer ;
begin
if isempty ( t ) then
  accsum : = s
else
  accsum := accsum ( accsum ( s + value ( t ) , left ( t ) ) , right ( t ) )
end ;

with top-level call "accsum ( 0 , <some-tree> )"

Here we have double recursion, which for a tree of maximum depth "d" it can be executed in O (d) space. For a balanced tree of "n" nodes, this means O(log(n)) space, which is still a huge improvement on the original version. 

The key to making this work invisibly is provide higher-order functions to hide the recursion. If necessary they can be hand-coded in assembler and the compiler need never do any optimisation.

i.e given a type "munge = function (  a , b : <some-type> ) : <some-type>"

we define:

function mungetree ( b : <some-type> ; m : munge ; t : <tree-containing-<some-type>-values> ) : <some-type> ;
begin
if isempty ( t ) then
  mungetree : = b
else
  mungetree := mungetree ( mungetree ( munge ( s , value ( t ) ) , left ( t ) ) , right ( t ) )
end ;

and depending on what <some-type> is, and the actual function we supply for "m" (with of course an appropriate base-case value for "b") we can traverse any tree in O(log(n)) space, irrespective of the type of values in the tree or the type of operation performed on them.

Or we could if Pascal had a proper polymorphic type system. But it ain't, it's too old-fashioned.


On 3 Sep 2010, at 20:32, fpc-pascal-request at lists.freepascal.org wrote:

> From: Jos? Mejuto <joshyfun at gmail.com>
> Subject: Re: [fpc-pascal] Recursion optimization by compiler
> To: FPC-Pascal users discussions <fpc-pascal at lists.freepascal.org>
> Message-ID: <166212194.20100903141757 at gmail.com>
> Content-Type: text/plain; charset=ISO-8859-15
> 
> Hello FPC-Pascal,
> 
> Friday, September 3, 2010, 1:12:31 PM, you wrote:
> 
> BA> After my previous post, "TreeView and Nonrecursion", I'd tried to ask the same
> BA> topics in stackoverflow.com 
> BA> (http://stackoverflow.com/questions/3630047/treeview-control-and-nonrecursion)
> BA> and I got something new.
> BA> It is possible to recurse without using up the stack space.  Optimizing
> BA> compilers are able to accomplish certain types of recursion into  efficient
> BA> non-recursive loops, but the code has to be written into a tail recursion form.
> BA> Is there such a recursion optimization in FPC? If it is not enabled by default,
> BA> how to enable it?
> 
> That kind of "optimization" to me only seems interesting in two
> possible situations, when calling functions is high costly or when
> there is a very limited stack amount (really, really small). From my
> point of view if you need more that 1MB of stack size I'm quite sure
> that you are doing something wrong, because around 65000 recursion
> levels looks not very good.
> 
> The recursion "optimization" is being done using regular RAM as a
> stack:
> 
> function Recursive(var a: integer): boolean;
> begin
>  if a=1000 then begin
>    Result:=false;
>    exit;
>  end else begin
>    inc(a);
>    Result:=Recursive(a);
>  end;
> end;
> 
> Function Iterative(var a: integer): boolean;
> var
>  Stack: TStack;
>  v: integer;
> begin
>  stack.Push(a);
>  while true do begin
>    v:=stack.pop;
>    if v=1000 then begin
>      Result:=true;
>      a:=v;
>      break;
>    end else begin
>      inc(v);
>      Stack.push(v);
>    end;
>  end;
>  stack.free;
> end;
> 
> To me the first one looks more clear...
> 
> -- 
> Best regards,
> José




More information about the fpc-pascal mailing list