[fpc-devel] Tail recursion optimization
Florian Klaempfl
florian at freepascal.org
Tue Oct 10 11:09:24 CEST 2006
Jonas Maebe wrote:
>
> On 10 okt 2006, at 10:34, Florian Klaempfl wrote:
>
>> Not really. I didn't try it yet but it shouldn't mess up much. The
>> optimziation
>> is done completely on the node level, the pascal code would look like
>> http://www.hu.freepascal.org/fpcircbot/cgipastebin?msgid=156 except
>> that there
>> are temps involved to calculate the new parameters because calculating
>> one
>> parameter could require the original value of another one.
>
> Which means that the parameter values cannot be properly seen in the
> debugger, no?
This is a hidden step, but the callstack window isn't done properly, that's true.
> Also, do you actually insert goto/label nodes? That would
> degrade the performance of the register variable assignment because of
> the current limitations concerning flow analysis (i.e., for sufficiently
> complex routines, the tail recursion optimization may currently result
> in performance degradation rather than improvement compared to using
> regvars without it).
The recursive call is usually inside an if anyways so the ssa optimizer is lost
anyways, no? But as I said, for the usual benchmark case it works great (the
code is almost on pair with gcc after a quick look at it):
.balign 16,0x90
.globl P$ACKERMAN_ACK$LONGINT$LONGINT$$LONGINT
P$ACKERMAN_ACK$LONGINT$LONGINT$$LONGINT:
# Temps allocated between esp+0 and esp+16
# [ackermann.pp]
# [8] begin
subl $16,%esp
# Var M located in register ebx
# Var N located in register esi
# Var $result located in register edi
movl %ebx,(%esp)
movl %esi,4(%esp)
movl %edi,8(%esp)
movl %eax,%ebx
movl %edx,%esi
.Lj5:
# [9] if M = 0 then Ack := N+1
testl %ebx,%ebx
jne .Lj7
movl %esi,%eax
incl %eax
movl %eax,%edi
jmp .Lj10
.Lj7:
# [10] else if N = 0 then Ack := Ack(M-1, 1)
testl %esi,%esi
jne .Lj12
movl $1,%edx
decl %ebx
movl %edx,%esi
jmp .Lj5
.Lj12:
# [11] else Ack := Ack(M-1, Ack(M, N-1));
movl %esi,%edx
decl %edx
movl %ebx,%eax
call P$ACKERMAN_ACK$LONGINT$LONGINT$$LONGINT
decl %ebx
movl %eax,%esi
jmp .Lj5
.Lj21:
.Lj10:
# [12] end;
movl %edi,%eax
movl (%esp),%ebx
movl 4(%esp),%esi
movl 8(%esp),%edi
addl $16,%esp
ret
BTW: Does the assembler optimizer track flag usage? Then we could remove some of
the
movl %esi,%eax
incl %eax
like pairs.
>
>>> e.g., we
>>> could add -Oonostackframe at the end of the compiler switches for the
>>> RTL's object unit when compiling for x86).
>>
>> What we still need are switches in the sources to enable particular
>> optimizations.
>
> {$optimization nostackframe}
>
> should work.
Didn't know about it :)
More information about the fpc-devel
mailing list