[fpc-devel] Optimizations
Martin Frb
lazarus at mfriebe.de
Fri Jan 24 02:03:02 CET 2014
On 24/01/2014 00:17, August Oktobar wrote:
> "2) Reporter's assumption about fstp is wrong: the first fstp
> instruction removes value from fpu stack, so it cannot be used for the
> second time without first reloading value onto stack."
>
> Compiler should reuse loaded value (a[i]) and store to a[i] using
> fstl, then fstpl to a[i+1]
That is why Sergei wrote "typical common subexpression elimination". I
am sure it is a todo for the fpc team.
Also, in this case, optimizing the code to reuse the loaded value is only
a small gain. You still have plenty of statements that recalculate the
address of an array element, using multiplication.
Introducing a temporary pointer to a[i], and using addition in each run
of the loop to increment it, will gain a lot more. (Again, I am sure it
is a todo.)
Until that is done, your best choice, if you need the speed, is to do
this by hand:
if cnt = 0 then exit;
tmpptrA := @a[0];
tmpptrB := @b[0];
for i := 0 to cnt - 1 do
begin
  tmpptrA^ := tmpptrA^ + tmpptrB^;
  tmpptrA2 := tmpptrA; // remember the address of a[i]
  inc(tmpptrA); // assuming a typed pointer
  tmpptrA^ := tmpptrA2^; // a[i+1] := a[i]
  inc(tmpptrB); // assuming a typed pointer
end;
or better
if cnt = 0 then exit;
tmpptrA := @a[0];
tmpptrB := @b[0];
for i := 0 to cnt - 1 do
begin
  tmpValue := tmpptrA^ + tmpptrB^;
  tmpptrA^ := tmpValue;
  inc(tmpptrA); // assuming a typed pointer
  tmpptrA^ := tmpValue;
  inc(tmpptrB); // assuming a typed pointer
end;
It loses readability, so keep the original code as a comment.
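To check that the hand-written pointer loop still matches the readable code, a small self-contained program can run both and compare the results. This is only a sketch under assumptions: the indexed loop (a[i] := a[i] + b[i]; a[i + 1] := a[i]) is inferred from the quoted discussion and is not shown in this message, the element type Double is a guess based on the fstpl mentioned above, and the names aRef and PtrLoopSketch, the value of cnt, and the test data are made up for illustration.

```pascal
program PtrLoopSketch;
{$mode objfpc}

const
  cnt = 4; // hypothetical element count

var
  // The loop also writes a[i + 1], so a needs cnt + 1 elements.
  a, aRef: array[0..cnt] of Double; // Double is an assumption
  b: array[0..cnt - 1] of Double;
  tmpptrA, tmpptrB: PDouble;        // PDouble comes from FPC's System unit
  tmpValue: Double;
  i: Integer;
begin
  // Made-up test data, identical for both versions.
  for i := 0 to cnt do
  begin
    a[i] := i;
    aRef[i] := i;
  end;
  for i := 0 to cnt - 1 do
    b[i] := 10 * (i + 1);

  // Readable version (assumed form of the original loop).
  for i := 0 to cnt - 1 do
  begin
    aRef[i] := aRef[i] + b[i];
    aRef[i + 1] := aRef[i];
  end;

  // Hand-optimized version: typed pointers are advanced with inc,
  // and the sum is kept in a temporary instead of being reloaded.
  tmpptrA := @a[0];
  tmpptrB := @b[0];
  for i := 0 to cnt - 1 do
  begin
    tmpValue := tmpptrA^ + tmpptrB^;
    tmpptrA^ := tmpValue; // a[i]
    inc(tmpptrA);
    tmpptrA^ := tmpValue; // a[i + 1]
    inc(tmpptrB);
  end;

  // Compare the two results element by element.
  for i := 0 to cnt do
    if a[i] <> aRef[i] then
    begin
      WriteLn('mismatch at index ', i);
      Halt(1);
    end;
  WriteLn('both versions agree');
end.
```

Keeping a check like this next to the optimized loop makes it easier to return to the readable version once the compiler handles the case itself.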
There is a bigger example, where exactly that was done, because FPC's
optimization was not sufficient for what the author wanted.
http://bugs.freepascal.org/view.php?id=10275
>
>
> On Fri, Jan 24, 2014 at 12:26 AM, Sergei Gorelkin
> <sergei_gorelkin at mail.ru> wrote:
>
>
> 1) You are right that it's not the job for peephole analyzer, it
> is typical common subexpression elimination.
> 2) Reporter's assumption about fstp is wrong: the first fstp
> instruction removes value from fpu stack, so it cannot be used for
> the second time without first reloading value onto stack.
> 3) The assignments of floating-point values are currently being
> generated using integer instructions, hence the subsequent code.
> This way it doesn't depend on number of available FPU registers,
> which is hard to know at any point.
>