<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 24/01/2014 00:17, August Oktobar
wrote:<br>
</div>
<blockquote
cite="mid:CABQ5zt0yMCcPLkmqUMmEMV3MzbnYO3iRPX3GFSHRmA+2RX1aLQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>"2) Reporter's assumption about fstp is wrong: the first
fstp instruction removes value from fpu stack, so it cannot be
used for the second time without first reloading value onto
stack."<br>
<br>
</div>
Compiler should reuse loaded value (a[i]) and store to a[i]
using fstl, then fstpl to a[i+1]</div>
</blockquote>
<br>
That is why Sergei wrote "typical common subexpression elimination".
I am sure it is on the FPC team's to-do list.<br>
<br>
Also, in this case reusing the loaded value is only a small gain.
You still have plenty of statements that recalculate the
address of an array element, using multiplication.<br>
<br>
Introducing a temporary pointer to a[i], and incrementing it by
addition on each iteration of the loop, will gain a lot more. (Again,
I am sure it is on the to-do list.)<br>
<br>
Until that is done, your best choice, if you need the speed, is to do
this by hand:<br>
<br>
if cnt = 0 then exit;<br>
tmpptrA := @a[0];<br>
tmpptrB := @b[0];<br>
for i := 0 to cnt - 1 do<br>
begin<br>
tmpptrA^ := tmpptrA^ + tmpptrB^;<br>
tmpptrA2 := tmpptrA; // keep the address of a[i]; tmpptrA2 is a typed pointer too<br>
inc(tmpptrA); // assuming a typed pointer<br>
tmpptrA^ := tmpptrA2^;<br>
inc(tmpptrB); // assuming a typed pointer<br>
end;<br>
<br>
or, better:<br>
if cnt = 0 then exit;<br>
tmpptrA := @a[0];<br>
tmpptrB := @b[0];<br>
for i := 0 to cnt - 1 do<br>
begin<br>
tmpVAlue := tmpptrA^ + tmpptrB^;<br>
tmpptrA^ := tmpVAlue;<br>
inc(tmpptrA); // assuming a typed pointer<br>
tmpptrA^ := tmpVAlue;<br>
inc(tmpptrB); // assuming a typed pointer<br>
end;<br>
<br>
It loses readability, so keep the original code as a comment.<br>
<br>
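One way to keep the readable version around, and to check the
hand-written loop against it, is a small test program. This is only a
sketch: the original indexed loop is not shown in this message, so its
shape (a[i] := a[i] + b[i]; a[i+1] := a[i]) is assumed from the
discussion above, and the element type Double, the array size and all
names here are made up for illustration.<br>
<pre>
program PtrLoopSketch;
{$mode objfpc}

const
  Cnt = 4;

type
  // One extra slot, because the loop also writes a[i+1].
  TVals = array[0..Cnt] of Double;

var
  a1, a2, b: TVals;
  i: Integer;
  pA, pB: PDouble;
  v: Double;
begin
  // Same start values for both variants.
  for i := 0 to Cnt do
  begin
    a1[i] := i;
    a2[i] := i;
    b[i] := 10 * i;
  end;

  // Readable, indexed form (assumed shape of the original loop).
  for i := 0 to Cnt - 1 do
  begin
    a1[i] := a1[i] + b[i];
    a1[i + 1] := a1[i];
  end;

  // Hand-optimized form: typed pointers and one temporary value.
  pA := @a2[0];
  pB := @b[0];
  for i := 0 to Cnt - 1 do
  begin
    v := pA^ + pB^;
    pA^ := v;
    Inc(pA); // typed pointer: advances by SizeOf(Double)
    pA^ := v;
    Inc(pB);
  end;

  // Compare the two results element by element.
  for i := 0 to Cnt do
    if a1[i] &lt;&gt; a2[i] then
    begin
      WriteLn('mismatch at index ', i);
      Halt(1);
    end;
  WriteLn('both loops agree');
end.
</pre>
The values are small whole numbers, so comparing the results exactly
is safe here; with arbitrary floating-point data a tolerance would be
the more careful choice.<br>
<br>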
There is a bigger example where exactly that was done, because FPC's
optimization was not sufficient for what the author wanted.<br>
<a class="moz-txt-link-freetext" href="http://bugs.freepascal.org/view.php?id=10275">http://bugs.freepascal.org/view.php?id=10275</a><br>
<br>
<br>
<blockquote
cite="mid:CABQ5zt0yMCcPLkmqUMmEMV3MzbnYO3iRPX3GFSHRmA+2RX1aLQ@mail.gmail.com"
type="cite">
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Fri, Jan 24, 2014 at 12:26 AM,
Sergei Gorelkin <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:sergei_gorelkin@mail.ru" target="_blank">sergei_gorelkin@mail.ru</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div class="h5"><br>
</div>
</div>
1) You are right that it's not the job for peephole
analyzer, it is typical common subexpression elimination.<br>
2) Reporter's assumption about fstp is wrong: the first fstp
instruction removes value from fpu stack, so it cannot be
used for the second time without first reloading value onto
stack.<br>
3) The assignments of floating-point values are currently
being generated using integer instructions, hence the
subsequent code. This way it doesn't depend on number of
available FPU registers, which is hard to know at any point.<br>
</blockquote>
</div>
</div>
</blockquote>
<br>
</body>
</html>