[fpc-devel] Tail recursion optimization

Tue Oct 10 14:12:10 CEST 2006

On 10 okt 2006, at 13:46, Florian Klaempfl wrote:

> Practical argument:
> the assembler code _is_ better for the code I tested and up to as  
> twice as fast
> as the original one.

That's indeed true for an extremely small function with so few local  
variables/parameters that even on an i386 it doesn't need spilling in  
the absence of register allocation optimizations. That's why I said  
it *may* currently degrade performance in more complex functions (it  
also may not, I really don't know, it was just a remark).

> Theoretical argument:
> - using the tail goto you've one function with one set of variables  
> being active
> across the the whole function
> - using a recursive call you've at least two sets of variables: the  
> caller and
> the callee ones. Though one set is only active at a limited part of  
> the
> function, the set of the caller is still in use while the callee is  
> called
> though they (the caller variables) are spilled.

The first point is just as much a downside as an upside in the  
current situation, because parameters/variables which are used  
without being destroyed by some function call can normally be put in  
a (reusable) volatile register. Now they need a non-volatile register  
during the entire function.

The fact that you get such a speedup is in my view mainly an  
indication of the fact that the function barely does anything, and  
that almost half the time is spent in setting up and tearing down  
stack frames. So it's logical that register allocation has little  
influence and that optimizations which remove this stack frame logic  
help a lot.

> I guess it could be used to change to code above to
> leal 1(%esi),%eax
> ?

Yes.

> Or is this slower on some CPUs?

I don't know.

Jonas