[fpc-devel] Rules about record types and internal storage

J. Gareth Moreton gareth at moreton-family.com
Thu Apr 2 15:24:47 CEST 2020

Hi everyone,

I'm just trying to get properly versed in the rules of record types 
over here: https://www.freepascal.org/docs-html/ref/refsu15.html

For standard record types (no "packed" modifier or compiler directives 
or anything), I'm wondering how many liberties the compiler is 
allowed to take in storing their data.  Take the following example (from 
raybench.pas over here: http://runtimeterror.com/tools/raybench ):

   type
     Vec = record
       X, Y, Z: Single;
     end;

   function VLength(const V: Vec): Single;
   begin
     Result := Sqrt(V.X*V.X + V.Y*V.Y + V.Z*V.Z);
   end;

If you wanted to get the highest speed possible for this routine on an 
Intel processor, one solution would be to pack X, Y and Z together, have 
an invisible 4th dummy field, and align the entire record to a 16-byte 
boundary, because then you could pass the entire record by value through 
an XMM register and the VLength operation can be performed with just a 
pair of VDPPS (dot product) and VSQRTSS (square root) instructions.
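
To make the layout question concrete, here is a minimal sketch that spells out the "invisible 4th dummy field" as an explicit field and compares the two sizes. VecPadded and its Pad field are hypothetical names used only for illustration; nothing here says the compiler does or may perform this transformation itself, and the sketch does not force 16-byte alignment.

```pascal
program VecLayout;

{$mode objfpc}

type
  { Layout from the message: three Singles, no modifiers. }
  Vec = record
    X, Y, Z: Single;
  end;

  { Hypothetical hand-written version of the padded layout: Pad plays
    the role of the dummy 4th field, so the record fills 16 bytes. }
  VecPadded = record
    X, Y, Z: Single;
    Pad: Single;
  end;

function VLength(const V: Vec): Single;
begin
  Result := Sqrt(V.X*V.X + V.Y*V.Y + V.Z*V.Z);
end;

var
  V: Vec;
begin
  V.X := 3.0;
  V.Y := 4.0;
  V.Z := 0.0;
  { Print both sizes so the extra 4 bytes of padding are visible. }
  WriteLn('SizeOf(Vec)       = ', SizeOf(Vec));
  WriteLn('SizeOf(VecPadded) = ', SizeOf(VecPadded));
  WriteLn('VLength           = ', VLength(V):0:1);
end.
```

With three 4-byte Singles the plain record should occupy 12 bytes and the padded one 16, which is the size of one XMM register.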

Obviously if you use 'packed record' or a compiler directive to change 
things up, this may not be possible, but at what points is the compiler 
allowed to make its own choice on what might be best?
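
For comparison, this is a small sketch of the case where the programmer has taken the choice away from the compiler. TDefaultRec and TPackedRec are made-up names for illustration; the size of the non-packed record depends on the target and the active alignment settings, so treat that number as an assumption rather than a guarantee.

```pascal
program PackedLayout;

{$mode objfpc}

type
  { Default layout: the compiler may insert padding after Tag so that
    Value is naturally aligned. }
  TDefaultRec = record
    Tag: Byte;
    Value: LongInt;
  end;

  { 'packed' layout: fields are stored back to back with no padding. }
  TPackedRec = packed record
    Tag: Byte;
    Value: LongInt;
  end;

begin
  WriteLn('SizeOf(TDefaultRec) = ', SizeOf(TDefaultRec));
  WriteLn('SizeOf(TPackedRec)  = ', SizeOf(TPackedRec));
end.
```

The packed record is 5 bytes (1 + 4), while the default one is typically 8 on common targets; the question above is how much further than this kind of padding the compiler may go when no modifier is given.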

Gareth aka. Kit

P.S. Of course you can force it by declaring the vector as an m128 type 
and specifying vectorcall for x86_64-win64, but not everyone will know 
to do that and it gets unwieldy rather quickly.  Speaking of 
vectorcall, I'm wondering if we can introduce 'fastcall' as an alias for 
'ms_abi_default'.  That way, if we follow Microsoft Visual C++'s example 
of automatically making all routines vectorcall (which is closer to the 
System V ABI used by Linux and will make vectorisation easier), we can 
still force the default convention when we need a routine that, say, has 
to interface with a third-party library.  ('fastcall' under win32 is what 
the MS ABI is based on: first parameter in ECX, second in EDX and 
everything else on the stack.)

