<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#ffffff" text="#000000">
<font size="-1">Hi,<br>
<br>
I'm puzzled by some of the code generated for x64. I came across this
earlier post:<br>
<a class="moz-txt-link-freetext" href="http://www.hu.freepascal.org/lists/fpc-pascal/2005-March/008175.html">http://www.hu.freepascal.org/lists/fpc-pascal/2005-March/008175.html</a><br>
<br>
Compiling the simple example with the loop in a function, I get much
leaner &amp; meaner code than the [i386] assembler in the original post, but
I had to use -O3 and that separate function to get full use of xmm
registers instead of the stack. <br>
<br>
<br>
Program tttt; <br>
<br>
Function loop (A,B : double) : double; <br>
Var X : LongInt; <br>
Begin <br>
For X := 0 to 10000000 do <br>
Begin <br>
A := A + X; <br>
A := A * B; <br>
End; <br>
loop := A; <br>
End; <br>
<br>
Var A,B : double; <br>
Begin <br>
A := 0; <br>
B := 0.9; <br>
loop (A,B); <br>
WRITELN (loop (A,B):0:9); <br>
end.<br>
<br>
<br>
Looking at the assembler code for the loop:<br>
<br>
# Var A located in register xmm0 <br>
# Var B located in register xmm1 <br>
# Var $result located in register xmm0 <br>
# Var X located in register eax // AND xmm2 !<br>
<br>
# [7] For X := 0 to 10000000 do <br>
movl $0,%eax <br>
decl %eax <br>
.balign 4,0x90 <br>
.Lj7: <br>
incl %eax <br>
# [9] A := A + X; <br>
cvtsi2sdl %eax,%xmm2 <br>
addsd %xmm0,%xmm2 <br>
movsd %xmm2,%xmm0 <br>
# [10] A := A * B; <br>
movsd %xmm0,%xmm2 <br>
mulsd %xmm1,%xmm2 <br>
movsd %xmm2,%xmm0 <br>
cmpl $10000000,%eax <br>
jl .Lj7 <br>
# [14] end; <br>
movsd %xmm0,%xmm0 <br>
addq $24,%rsp <br>
ret <br>
<br>
<br>
</font><font size="-1">I am wondering </font><font size="-1">what the
point of all the xmm2 traffic is, apart from the initial conversion of X
from %eax. I have not set any debugging options. What is wrong with
the following? <br>
<br>
# [9] A := A + X; <br>
cvtsi2sdl %eax,%xmm2 <br>
addsd %xmm2,%xmm0 <br>
# [10] A := A * B; <br>
mulsd %xmm1,%xmm0<br>
cmpl $10000000,%eax <br>
jl .Lj7 <br>
# [14] end; <br>
<br>
I am also puzzled by the final instruction:<br>
movsd %xmm0,%xmm0 <br>
What does this do?<br>
<br>
I would really like to be able to generate optimal (i.e. minimal) xmm
code from Pascal without dropping into assembler. Are there any other
compiler switches that would help?<br>
<br>
<br>
Peter<br>
</font>
</body>
</html>