[fpc-pascal] code optimization
Jonas Maebe
jonas.maebe at elis.ugent.be
Fri Sep 24 14:35:45 CEST 2010
On 24 Sep 2010, at 11:48, Adrian Veith wrote:
> Changing to pointers reduces the number of multiplications for accessing
> the nth element in an array - if you compare the Delphi code to the FPC
> code at the assembler level, this is the main difference between the two
> generated codes.
Did you actually try replacing only the multiplications with lea's in
the assembler code generated by FPC (one lea to multiply by 5 and then
the times 4 during the load/store)? I did before posting my initial
reply because it also seemed to be the most logical explanation to me.
It turned out to be a red herring:
With imull $20:
# iterations: 26662054
no solution found
runtime: 10.75s
With "lea (%reg,%reg,4),%reg" followed by "movl (%xxx,%reg,4),%yyy"
(not just for mov, but for every single memory expression that
depends on an "imull $20"):
# iterations: 26662054
no solution found
runtime: 10.06s
Kylix 3 (~ Delphi 6.5):
# iterations: 26662054
no solution found
runtime: 6.65s
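For reference, the lea substitution tested above relies on 20 = 5 * 4: the lea computes index*5, and the scale factor of 4 in the addressing mode supplies the rest. A minimal C sketch of that arithmetic follows, assuming a 20-byte array element; the names ELEM_SIZE, offset_imul, offset_lea and check_offsets are made up for illustration and do not come from the benchmarked program.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed element size in bytes, matching the "imull $20" in the FPC output. */
enum { ELEM_SIZE = 20 };

/* Byte offset computed with one multiplication, as "imull $20" does. */
size_t offset_imul(size_t index)
{
    return index * ELEM_SIZE;
}

/* Same offset in two steps:
 *   lea (%reg,%reg,4),%reg   -> index + index*4 = index*5
 *   movl (%xxx,%reg,4),%yyy  -> the addressing mode scales by 4 */
size_t offset_lea(size_t index)
{
    size_t times5 = index + index * 4;
    return times5 * 4;
}

/* Checks that both forms agree for every index below n. */
void check_offsets(size_t n)
{
    for (size_t i = 0; i < n; i++)
        assert(offset_imul(i) == offset_lea(i));
}
```

Both functions return the same offset for every index, so the rewrite changes only which instructions compute the address, not the address itself. That is consistent with the small difference between the first two timings above.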
> Register allocation is on a comparable level for both versions.
Delphi keeps the "Bar" pointer in a register, while FPC spills it to
the stack. Because Bar is used in most of the most-executed
statements, this has a huge impact.
Jonas