[fpc-pascal] Efficiency of generated code [x86_64]

Fri Jun 24 19:16:29 CEST 2011

Hi,

I'm puzzled by some of the code generated for x86_64. I came across this 
earlier post:
http://www.hu.freepascal.org/lists/fpc-pascal/2005-March/008175.html

Compiling the simple example with the loop in a function, I get much 
leaner and meaner code than the [i386] assembler in the original post, but 
I had to use -O3 and that separate function to get full use of the xmm 
registers instead of the stack.

Program tttt;

Function loop (A,B : double) : double;
Var X : LongInt;
Begin
     For X := 0  to 10000000 do
     Begin
         A := A + X;
         A := A * B;
     End;
     loop := A;
End;

Var A,B : double;
Begin
     A := 0;
     B := 0.9;
     loop (A,B);
     WRITELN (loop (A,B):0:9);
end.

Looking at the assembler loop code

# Var A located in register xmm0
# Var B located in register xmm1
# Var $result located in register xmm0
# Var X located in register eax    //   AND xmm2 !

# [7] For X := 0  to 10000000 do
     movl    $0,%eax
     decl    %eax
     .balign 4,0x90
.Lj7:
     incl    %eax
# [9] A := A + X;
     cvtsi2sdl    %eax,%xmm2
     addsd    %xmm0,%xmm2
     movsd    %xmm2,%xmm0
# [10] A := A * B;
     movsd    %xmm0,%xmm2
     mulsd    %xmm1,%xmm2
     movsd    %xmm2,%xmm0
     cmpl    $10000000,%eax
     jl    .Lj7
# [14] end;
     movsd    %xmm0,%xmm0
     addq    $24,%rsp
     ret

I am wondering what the point of all the xmm2 shuffling is, apart from the 
initial conversion of X from %eax. I have not set any debugging options. 
What is wrong with the following?

# [9] A := A + X;
     cvtsi2sdl  %eax,%xmm2
     addsd  %xmm2,%xmm0
# [10] A := A * B;
     mulsd  %xmm1,%xmm0
     cmpl    $10000000,%eax
     jl    .Lj7
# [14] end;

I am also puzzled by the final
movsd %xmm0,%xmm0
What does this do?

I would really like to be able to generate optimal (i.e. minimal) xmm code 
from Pascal without dropping into assembler. Are there any other 
compiler switches that would help?

Peter