[fpc-devel] Tail recursion optimization

Florian Klaempfl florian at freepascal.org
Tue Oct 10 11:09:24 CEST 2006


Jonas Maebe wrote:
> 
> On 10 okt 2006, at 10:34, Florian Klaempfl wrote:
> 
>> Not really. I didn't try it yet but it shouldn't mess up much. The
>> optimziation
>> is done completely on the node level, the pascal code would look like
>> http://www.hu.freepascal.org/fpcircbot/cgipastebin?msgid=156 except
>> that there
>> are temps involved to calculate the new parameters because calculating
>> one
>> parameter could require the original value of another one.
> 
> Which means that the parameter values cannot be properly seen in the
> debugger, no? 

This is a hidden step, but the callstack window isn't done properly, that's true.

> Also, do you actually insert goto/label nodes? That would
> degrade the performance of the register variable assignment because of
> the current limitations concerning flow analysis (i.e., for sufficiently
> complex routines, the tail recursion optimization may currently result
> in performance degradation rather than improvement compared to using
> regvars without it).

The recursive call is usually inside an if anyways so the ssa optimizer is lost
anyways, no? But as I said, for the usual benchmark case it works great (the
code is almost on pair with gcc after a quick look at it):

	.balign 16,0x90
.globl	P$ACKERMAN_ACK$LONGINT$LONGINT$$LONGINT
P$ACKERMAN_ACK$LONGINT$LONGINT$$LONGINT:
# Temps allocated between esp+0 and esp+16
# [ackermann.pp]
# [8] begin
	subl	$16,%esp
# Var M located in register ebx
# Var N located in register esi
# Var $result located in register edi
	movl	%ebx,(%esp)
	movl	%esi,4(%esp)
	movl	%edi,8(%esp)
	movl	%eax,%ebx
	movl	%edx,%esi
.Lj5:
# [9] if M = 0 then Ack := N+1
	testl	%ebx,%ebx
	jne	.Lj7
	movl	%esi,%eax
	incl	%eax
	movl	%eax,%edi
	jmp	.Lj10
.Lj7:
# [10] else if N = 0 then Ack := Ack(M-1, 1)
	testl	%esi,%esi
	jne	.Lj12
	movl	$1,%edx
	decl	%ebx
	movl	%edx,%esi
	jmp	.Lj5
.Lj12:
# [11] else Ack := Ack(M-1, Ack(M, N-1));
	movl	%esi,%edx
	decl	%edx
	movl	%ebx,%eax
	call	P$ACKERMAN_ACK$LONGINT$LONGINT$$LONGINT
	decl	%ebx
	movl	%eax,%esi
	jmp	.Lj5
.Lj21:
.Lj10:
# [12] end;
	movl	%edi,%eax
	movl	(%esp),%ebx
	movl	4(%esp),%esi
	movl	8(%esp),%edi
	addl	$16,%esp
	ret


BTW: Does the assembler optimizer track flag usage? Then we could remove some of
the 	

movl	%esi,%eax
incl	%eax

like pairs.

> 
>>> e.g., we
>>> could add -Oonostackframe at the end of the compiler switches for the
>>> RTL's object unit when compiling for x86).
>>
>> What we still need are switches in the sources to enable particular
>> optimizations.
> 
> {$optimization nostackframe}
> 
> should work.

Didn't know about it :)



More information about the fpc-devel mailing list