[fpc-devel] Optimizations

Martin Frb lazarus at mfriebe.de
Fri Jan 24 02:03:02 CET 2014


On 24/01/2014 00:17, August Oktobar wrote:
> "2) Reporter's assumption about fstp is wrong: the first fstp 
> instruction removes value from fpu stack, so it cannot be used for the 
> second time without first reloading value onto stack."
>
> Compiler should reuse loaded value (a[i]) and store to a[i] using 
> fstl, then fstpl to a[i+1]

That is why Sergei wrote "typical common subexpression elimination". I 
am sure it is a todo for the fpc team.

Also, in this case, optimizing the loaded value to be re-used is only a 
small gain. You still have plenty of statements that recalculate the 
address of an array element, using multiplication.

Introducing a temporary pointer to a[i], and using addition in each run 
of the loop to increment it, will gain a lot more. (Again, I am sure it 
is a todo).

Until that is done, your best choice, if you need the speed, is to do 
this by hand:

if cnt = 0 then exit;
tmpptrA := @a[0];
tmpptrB := @b[0];
for i := 0 to cnt - 1 do
     begin
       tmpptrA^ := tmpptrA^ + tmpptrB^;
       tmpptrA2 := tmpptrA; // keep the address of a[i]; it is dereferenced below
       inc(tmpptrA); // assuming a typed pointer
       tmpptrA^ := tmpptrA2^;
       inc(tmpptrB); // assuming a typed pointer
     end;

or better:

if cnt = 0 then exit;
tmpptrA := @a[0];
tmpptrB := @b[0];
for i := 0 to cnt - 1 do
     begin
       tmpValue := tmpptrA^ + tmpptrB^;
       tmpptrA^ := tmpValue;
       inc(tmpptrA); // assuming a typed pointer
       tmpptrA^ := tmpValue;
       inc(tmpptrB); // assuming a typed pointer
     end;

It loses readability, so keep the original, readable code as a comment.
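
For reference, here is a minimal self-contained sketch of the second 
variant, with the readable loop kept as a comment next to it. The loop 
under discussion is not quoted in this mail, so the indexed form below 
is only an assumption about its shape; the names Cnt, PElem, expected, 
pa, pb and v are made up for the illustration. The program runs both 
forms on the same data and compares the results.

```pascal
program PtrLoopSketch;
{$mode objfpc}

const
  Cnt = 4; // hypothetical size; a needs Cnt + 1 elements because a[i+1] is written

type
  PElem = ^Double; // typed pointer, so Inc() advances by SizeOf(Double)

var
  a, expected: array[0..Cnt] of Double;
  b: array[0..Cnt - 1] of Double;
  pa, pb: PElem;
  v: Double;
  i: Integer;
begin
  // Same start values for both versions
  for i := 0 to Cnt do
  begin
    a[i] := i;
    expected[i] := i;
  end;
  for i := 0 to Cnt - 1 do
    b[i] := 10 * (i + 1);

  // Readable version (assumed shape of the original loop)
  for i := 0 to Cnt - 1 do
  begin
    expected[i] := expected[i] + b[i];
    expected[i + 1] := expected[i];
  end;

  // Hand-optimized version; the readable form is:
  //   a[i] := a[i] + b[i];
  //   a[i + 1] := a[i];
  pa := @a[0];
  pb := @b[0];
  for i := 0 to Cnt - 1 do
  begin
    v := pa^ + pb^; // compute the sum once
    pa^ := v;       // store to a[i]
    Inc(pa);        // pa now points to a[i + 1]
    pa^ := v;       // store the same value to a[i + 1]
    Inc(pb);
  end;

  // Both versions must produce identical arrays
  for i := 0 to Cnt do
    if a[i] <> expected[i] then
    begin
      WriteLn('mismatch at index ', i);
      Halt(1);
    end;
  WriteLn('ok');
end.
```

Whether the pointer form is actually faster depends on the FPC version 
and the optimization switches used, so it is worth measuring before 
giving up the readable code.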

There is a bigger example, where exactly that was done, because FPC's 
optimization was not sufficient for what the author wanted.
http://bugs.freepascal.org/view.php?id=10275


>
>
> On Fri, Jan 24, 2014 at 12:26 AM, Sergei Gorelkin 
> <sergei_gorelkin at mail.ru> wrote:
>
>
>     1) You are right that it's not the job for peephole analyzer, it
>     is typical common subexpression elimination.
>     2) Reporter's assumption about fstp is wrong: the first fstp
>     instruction removes value from fpu stack, so it cannot be used for
>     the second time without first reloading value onto stack.
>     3) The assignments of floating-point values are currently being
>     generated using integer instructions, hence the subsequent code.
>     This way it doesn't depend on number of available FPU registers,
>     which is hard to know at any point.
>
