[fpc-devel] Difficulty in specifying record alignment
J. Gareth Moreton
gareth at moreton-family.com
Mon Oct 21 21:42:13 CEST 2019
On 21/10/2019 20:00, Florian Klämpfl wrote:
> What's the problem with
>
> {$push}
> {$codealign RECORDMIN=16}
> type complex = record
> re : real;
> im : real;
> end;
> {$pop}
>
> ?
Hi Florian,
I tried that, but that puts each individual field on a 16-byte boundary
(and the documentation for RECORDMIN implies that this is correct
behaviour), which is why only an array element works. This occurs even
if "packed record" is used. The assembly language confirms this:
movsd U_$P$COMPLEX_$$_Z(%rip),%xmm0
addsd U_$P$COMPLEX_$$_X(%rip),%xmm0
movsd %xmm0,40(%rsp)
movsd U_$P$COMPLEX_$$_Z+16(%rip),%xmm0
addsd U_$P$COMPLEX_$$_X+16(%rip),%xmm0
movsd %xmm0,56(%rsp)
movq 40(%rsp),%rax
movq %rax,U_$P$COMPLEX_$$_Z(%rip)
movq 48(%rsp),%rax
movq %rax,U_$P$COMPLEX_$$_Z+8(%rip)
movq 56(%rsp),%rax
movq %rax,U_$P$COMPLEX_$$_Z+16(%rip)
The section with 48(%rsp) seems to relate to those 8 filler bytes and I
do question its validity, if not its necessity, since the rest of the
subroutine implies that the data at 48(%rsp) is undefined.
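The field spacing visible in the +16 offsets above can also be checked without reading assembly. The following is a minimal sketch, not part of the original test case: the type names TComplexDefault and TComplexMin16 and the program name are made up for illustration, and it assumes an x86_64 target where "real" maps to an 8-byte double.

```pascal
program recalign;
{$mode objfpc}

type
  { Declared with the default alignment settings }
  TComplexDefault = record
    re: real;
    im: real;
  end;

{$push}
{$codealign RECORDMIN=16}
type
  { Same fields, declared under RECORDMIN=16 }
  TComplexMin16 = record
    re: real;
    im: real;
  end;
{$pop}

var
  a: TComplexDefault;
  b: TComplexMin16;
begin
  { Offset of 'im' = address of the field minus address of the record }
  WriteLn('default:      size=', SizeOf(a),
          ' offset(im)=', PtrUInt(@a.im) - PtrUInt(@a));
  WriteLn('RECORDMIN=16: size=', SizeOf(b),
          ' offset(im)=', PtrUInt(@b.im) - PtrUInt(@b));
end.
```

If the behaviour described above holds, the second line should report 'im' at offset 16 rather than 8.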
In the part of the compiler that sets up XMM parameters
(compiler/x86_64/cpupara.pas, line 945), the following code decides
whether a packed vector register is used or two separate registers:
  if Assigned(parentdef) and
     ((parentdef.aggregatealignment mod 16) = 0) and
     ((byte_offset mod parentdef.aggregatealignment) <> 0) then
    { Aligned vector of type double }
    classes[0].typ:=X86_64_SSEUP_CLASS
  else
    classes[0].typ:=X86_64_SSEDF_CLASS;
(parentdef is the "complex" type in this case, while the regular def is
one of the real-type fields).
"byte_offset" for 're' is 0 and for 'im' is 8. The problem is that
parentdef.aggregatealignment, by default, is equal to 8, which indicates
that, as far as the compiler is concerned, it is only guaranteed to be
aligned to an 8-byte boundary, not 16 (the "mod" expression checks
this). When using the RECORDMIN=16 construct above, the
aggregatealignment does increase to 16, but "byte_offset" for 'im' also
becomes 16. At the moment, there's no way to configure a record type to
have an aggregate alignment of 16 and a field alignment of 8.
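To make the three cases concrete, here is a small sketch that evaluates only the condition quoted above for each layout. SseClassFor is a hypothetical helper written for this illustration, not a routine from the compiler; it leaves out the Assigned(parentdef) check and takes the alignment and offset values straight from the description above.

```pascal
program classifydemo;
{$mode objfpc}

{ Hypothetical helper: mirrors only the quoted condition, not the
  compiler's actual classification routine. }
function SseClassFor(AggregateAlignment, ByteOffset: Integer): String;
begin
  if ((AggregateAlignment mod 16) = 0) and
     ((ByteOffset mod AggregateAlignment) <> 0) then
    Result := 'X86_64_SSEUP_CLASS'
  else
    Result := 'X86_64_SSEDF_CLASS';
end;

begin
  { Default layout: aggregate alignment 8, fields at offsets 0 and 8 }
  WriteLn('default       re: ', SseClassFor(8, 0),
          '  im: ', SseClassFor(8, 8));
  { RECORDMIN=16: aggregate alignment 16, fields at offsets 0 and 16 }
  WriteLn('RECORDMIN=16  re: ', SseClassFor(16, 0),
          '  im: ', SseClassFor(16, 16));
  { Wanted layout: aggregate alignment 16, fields at offsets 0 and 8 }
  WriteLn('wanted        re: ', SseClassFor(16, 0),
          '  im: ', SseClassFor(16, 8));
end.
```

Only the third combination classifies 'im' as X86_64_SSEUP_CLASS; the other two give X86_64_SSEDF_CLASS for both fields, and the third is the layout that cannot currently be declared.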
Regarding the stack being aligned to 16-byte boundaries, while this can
be guaranteed for local variables and formal parameters, the actual
parameters passed into the function may not be aligned (e.g. when
dereferencing a pointer on the heap after calling, say, "New(complex);"),
which is why the compiler can only go by the 8-byte aggregate alignment.
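As a rough way to observe this for heap data, the sketch below allocates one record with New and prints its address modulo 16. The PComplex pointer type and the program name are introduced here for illustration only. A result of 0 from one run says nothing about what the heap manager guarantees in general.

```pascal
program heapalign;
{$mode objfpc}

type
  complex = record
    re: real;
    im: real;
  end;
  PComplex = ^complex;

var
  p: PComplex;
begin
  New(p);
  { Address modulo 16: 0 means this particular allocation happens to be
    16-byte aligned; it is an observation, not a guarantee. }
  WriteLn('address mod 16 = ', PtrUInt(p) mod 16);
  Dispose(p);
end.
```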
Gareth aka. Kit