[fpc-devel] Difficulty in specifying record alignment

J. Gareth Moreton gareth at moreton-family.com
Mon Oct 21 21:42:13 CEST 2019


On 21/10/2019 20:00, Florian Klämpfl wrote:
> What's the problem with
>
> {$push}
> {$codealign RECORDMIN=16}
> type complex = record
>                       re : real;
>                       im : real;
> end;
> {$pop}
>
> ?

Hi Florian,

I tried that, but it puts each individual field on a 16-byte boundary
(and the documentation for RECORDMIN implies that this is the intended
behaviour), which is why only an array element works. This occurs even
if "packed record" is used. The assembly language confirms this:

     movsd    U_$P$COMPLEX_$$_Z(%rip),%xmm0
     addsd    U_$P$COMPLEX_$$_X(%rip),%xmm0
     movsd    %xmm0,40(%rsp)
     movsd    U_$P$COMPLEX_$$_Z+16(%rip),%xmm0
     addsd    U_$P$COMPLEX_$$_X+16(%rip),%xmm0
     movsd    %xmm0,56(%rsp)
     movq    40(%rsp),%rax
     movq    %rax,U_$P$COMPLEX_$$_Z(%rip)
     movq    48(%rsp),%rax
     movq    %rax,U_$P$COMPLEX_$$_Z+8(%rip)
     movq    56(%rsp),%rax
     movq    %rax,U_$P$COMPLEX_$$_Z+16(%rip)

The section with 48(%rsp) seems to relate to those 8 filler bytes and I 
do question its validity, if not its necessity, since the rest of the 
subroutine implies that the data at 48(%rsp) is undefined.
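
A minimal sketch for reproducing the field layout described above without reading the assembly. It assumes an x86_64 target where "real" is a 64-bit double; the program name and the output labels are purely illustrative, and no particular output is claimed here since it depends on the compiler version and settings:

```pascal
program recordlayout;

{$mode objfpc}

{$push}
{$codealign RECORDMIN=16}
type
  complex = record
    re: real;
    im: real;
  end;
{$pop}

var
  z: complex;
begin
  { Total size of the record, including any filler bytes }
  WriteLn('SizeOf(complex) = ', SizeOf(complex));
  { Field offsets, computed as address differences from the record start }
  WriteLn('offset of re    = ', PtrUInt(@z.re) - PtrUInt(@z));
  WriteLn('offset of im    = ', PtrUInt(@z.im) - PtrUInt(@z));
  { Alignment of this particular global instance }
  WriteLn('address mod 16  = ', PtrUInt(@z) mod 16);
end.
```

If 'im' is reported at offset 16 rather than 8, that matches the +16 displacements in the listing above.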

In the compiler code that sets up XMM parameters
(compiler/x86_64/cpupara.pas, line 945), the following condition decides
whether a packed vector register is used or two separate registers:

                     if Assigned(parentdef) and
                        ((parentdef.aggregatealignment mod 16) = 0) and
                        ((byte_offset mod parentdef.aggregatealignment) <> 0) then
                       { Aligned vector of type double }
                       classes[0].typ:=X86_64_SSEUP_CLASS
                     else
                       classes[0].typ:=X86_64_SSEDF_CLASS;

(parentdef is the "complex" type in this case, while the regular def is 
one of the real-type fields).

"byte_offset" for 're' is 0 and for 'im' is 8.  The problem is that 
parentdef.aggregatealignment, by default, is equal to 8, which indicates 
that, as far as the compiler is concerned, it is only guaranteed to be 
aligned to an 8-byte boundary, not 16 (the "mod" expression checks 
this).  When using the RECORDMIN=16 construct above, the 
aggregatealignment does increase to 16, but "byte_offset" for 'im' also 
becomes 16.  At the moment, there's no way to configure a record type to 
have an aggregate alignment of 16 and a field alignment of 8.
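
To make the three cases above concrete, here is a standalone sketch of the same condition. It is not the compiler's code: ClassifyDouble, its parameters and the Show helper are hypothetical stand-ins, with HasParent playing the role of Assigned(parentdef), and the class names are returned as plain strings:

```pascal
program classifysketch;

{$mode objfpc}

{ Hypothetical model of the quoted condition; not the actual compiler routine }
function ClassifyDouble(HasParent: Boolean;
  AggregateAlignment, ByteOffset: LongInt): string;
begin
  if HasParent and ((AggregateAlignment mod 16) = 0) and
     ((ByteOffset mod AggregateAlignment) <> 0) then
    Result := 'X86_64_SSEUP_CLASS'
  else
    Result := 'X86_64_SSEDF_CLASS';
end;

{ Prints the class chosen for one field under the given assumptions }
procedure Show(const Caption: string; AggregateAlignment, ByteOffset: LongInt);
begin
  WriteLn(Caption, ': ', ClassifyDouble(True, AggregateAlignment, ByteOffset));
end;

begin
  { Default layout: aggregate alignment 8, 'im' at offset 8 }
  Show('default, im     ', 8, 8);    { X86_64_SSEDF_CLASS }
  { RECORDMIN=16: aggregate alignment 16, but 'im' moves to offset 16 }
  Show('RECORDMIN=16, im', 16, 16);  { X86_64_SSEDF_CLASS }
  { Desired layout: aggregate alignment 16, 'im' still at offset 8 }
  Show('desired, im     ', 16, 8);   { X86_64_SSEUP_CLASS }
  { 're' sits at offset 0 in every layout }
  Show('any layout, re  ', 16, 0);   { X86_64_SSEDF_CLASS }
end.
```

Only the third combination, which currently cannot be expressed in a record declaration, reaches the SSEUP branch.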

Regarding the stack being aligned to 16-byte boundaries: while this can
be guaranteed for local variables and formal parameters, the actual
parameters passed into the function may not be aligned (e.g. when
dereferencing a pointer to the heap after calling, say, "New(complex);"),
which is why the compiler can only go by the 8-byte aggregate alignment.
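
A quick way to look at the heap case is sketched below. The names PComplex and p are illustrative, and the result depends entirely on the memory manager in use, so a single run printing 0 proves nothing about the general guarantee:

```pascal
program heapalign;

{$mode objfpc}

type
  complex = record
    re: real;
    im: real;
  end;
  PComplex = ^complex;

var
  p: PComplex;
begin
  { Allocate one record on the heap and inspect where it landed }
  New(p);
  WriteLn('heap address mod 16 = ', PtrUInt(p) mod 16);
  Dispose(p);
end.
```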

Gareth aka. Kit

