[fpc-pascal] code optimization
Jonas Maebe
jonas.maebe at elis.ugent.be
Mon Sep 27 00:48:21 CEST 2010
On 26 Sep 2010, at 23:25, Florian Klämpfl wrote:
> Am 24.09.2010 15:36, schrieb Jonas Maebe:
>>
>> Correction: Delphi keeps the hidden parent frame pointer parameter in a
>> register (which is required every time Bar is accessed), while FPC puts
>> it on the stack. The end result is the same though: 1 extra memory
>> access every time Bar or any other variable from the parent procedure is
>> accessed.
>
> I've almost a patch ready which should solve this. Furthermore, I
> improved cse a little bit. Nevertheless, this safes only 5% on my
> machine. Unfortunatly, I found no proper way to enable cse on dyn. array
> expressions, this should help for another 5%.
I just quickly hacked support for turning the framepointer into a regvar into the compiler (always allow, instead of only allow when safe -- which is no problem for this test program), and also added SSU support for it. This still made FPC put the frame pointer on the stack and therefore didn't help though.
This was caused by another problem: FPC doesn't perform SSU/SSA inside any control flow construct, and the main body of the ExtendPartialSolution() is inside the else-path of an if-expression (so the framepointer was never SSU'd and spilled to the stack since its virtual register conflicted with everything else). Since the then-branch ends with "exit", I simply removed the "else" (and added a semicolon after the "end" of the if-block). Then I get the following results compared to Kylix' 6.51s and FPC's original 10.75s:
a) -O2: 9.50s
b) -O2 -Ooasmcse: 8.27s
c) -O3 -Ooasmcse: 7.85s
So to better optimise the original program with regular optimisation options, I guess we need
a) full SSA support
b) better (node) CSE support
> I think it should be also doable to use
> ebp as general purpose register if the stack frame is omitted
Yes, it should be possible, but when I tried this in the past I always got crashes. I didn't manage to figure out what was going wrong.
> Did anybody (Jonas?) profile which the real hot spot lines are?
In the original version (i.e., without any changes to the compiler or the program):
20.7% 107 if (UnsortedRectangle[i].Width <= MinValleyWidth)
6.9% 121 if (UnsortedRectangle[i].width <= MinValleyWidth)
5.7% 86 (Bar[PrevBar].Height > Bar[i].Height) and
5.4% 94 NextBar := Bar[i].Next;
5.1% 84 while NextBar <> 0 do begin
4.8% 87 (Bar[NextBar].Height > Bar[i].Height)
4.4% 109 CurrentRectangle := UnsortedRectangle[CurrentRectangle].NextUnusedRectangle
3.8% 97 MinValleyHeight := min(Bar[Bar[MinValley].Prev].Height, Bar[Bar[MinValley].Next].Height)- Bar[MinValley].Height;
3.2% 105 while CurrentRectangle <> 0 do begin
2.7% 85 if (Bar[i].Width < MinValleyWidth) and
2.1% 106 i := CurrentRectangle;
2.1% 92 PrevBar := i;
(second column is the line number)
Jonas
More information about the fpc-pascal
mailing list