[fpc-devel] Question on updating FPC packages

Marco van de Voort fpc at pascalprogramming.org
Tue Oct 29 12:06:01 CET 2019

On 2019-10-27 at 09:02, Florian Klämpfl wrote:
> I guess you're right.  It just seems weird because the System V ABI 
> was designed from the start to use the MM registers fully, so long as 
> the data is aligned.  In effect, it had vectorcall wrapped into its 
> design from the start.  Granted, vectorcall has some advantages and 
> can deal with relatively complex aggregates that the System V ABI 
> cannot handle (for example, a record type that contains a normal 
> vector and information relating to bump mapping).
>> I just hoped that making updates to uComplex, while ensuring existing 
>> Pascal code still compiles, would help take advantage of modern ABI 
>> designs.
> Is there currently any example which shows that vectorcall has any 
> advantage with FPC? Else I would propose first to make FPC able to 
> take advantage of it and then talk about if we really add vectorcall. 
> Currently I fear, FPC gets only into trouble when using vectorcall as 
> it tries first to push everything into one xmm register and then 
> splits this again in the callee.

Nils Haeck's FFT unit might be interesting. (same guy as nativejpg unit 
iirc, http://www.simdesign.nl)

It is a unit written at the Delphi 7 language level that uses a complex 
record and simple routines for the operations. It should be easy to port 
to uComplex. It is fairly high-level and switchable between single and 
double precision. (I use it in single mode, but to test vectorcall, 
double mode would presumably be best?)

And it has routines that do a variety of complex operations.
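
To make the shape of such a unit concrete, here is a minimal sketch of a 
complex record with a precision switch and the three helpers used further 
down. This is not Nils Haeck's actual code: the USE_DOUBLE define, the 
TFloat alias, the record layout and the exact signatures are all 
assumptions made for illustration.

```pascal
program ComplexSketch;
{$mode objfpc}

// Hypothetical define: comment it out to switch the whole sketch to Single.
{$DEFINE USE_DOUBLE}

type
{$IFDEF USE_DOUBLE}
  TFloat = Double;
{$ELSE}
  TFloat = Single;
{$ENDIF}

  // Assumed layout: a plain record holding real and imaginary parts.
  TComplex = record
    Re, Im: TFloat;
  end;

// Component-wise sum of two complex values.
function ComplexAdd(const A, B: TComplex): TComplex;
begin
  Result.Re := A.Re + B.Re;
  Result.Im := A.Im + B.Im;
end;

// Component-wise difference of two complex values.
function ComplexSub(const A, B: TComplex): TComplex;
begin
  Result.Re := A.Re - B.Re;
  Result.Im := A.Im - B.Im;
end;

// Scale a complex value by a real factor.
function ComplexScl(const S: TFloat; const A: TComplex): TComplex;
begin
  Result.Re := S * A.Re;
  Result.Im := S * A.Im;
end;

var
  A, B, C: TComplex;
begin
  A.Re := 1; A.Im := 2;
  B.Re := 3; B.Im := -1;
  // 2 * ((1 + 2i) + (3 - i))
  C := ComplexScl(2, ComplexAdd(A, B));
  WriteLn(C.Re:0:1, ' ', C.Im:0:1);
end.
```

With USE_DOUBLE defined the record is two 8-byte values, i.e. 16 bytes, 
which is the size of one XMM register; in single mode it is only 8 bytes.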

procedure FFT_5(var Z: array of TComplex);
// usage of open array is to make things generic. Could be solved differently.
var
   T1, T2, T3, T4, T5: TComplex;
   M1, M2, M3, M4, M5: TComplex;
   S1, S2, S3, S4, S5: TComplex;
begin
   T1 := ComplexAdd(Z[1], Z[4]);
   T2 := ComplexAdd(Z[2], Z[3]);
   T3 := ComplexSub(Z[1], Z[4]);
   T4 := ComplexSub(Z[3], Z[2]);

   T5   := ComplexAdd(T1, T2);
   Z[0] := ComplexAdd(Z[0], T5);
   M1   := ComplexScl(c51, T5);
   M2   := ComplexScl(c52, ComplexSub(T1, T2));

   // replace by i*add(t3,t4).scale(c53-i*c53) ?
   M3.Re := -c53 * (T3.Im + T4.Im);
   M3.Im :=  c53 * (T3.Re + T4.Re);
   M4.Re := -c54 * T4.Im;
   M4.Im :=  c54 * T4.Re;
   M5.Re := -c55 * T3.Im;
   M5.Im :=  c55 * T3.Re;

   S3 := ComplexSub(M3, M4);
   S5 := ComplexAdd(M3, M5);
   S1 := ComplexAdd(Z[0], M1);
   S2 := ComplexAdd(S1, M2);
   S4 := ComplexSub(S1, M2);

   Z[1] := ComplexAdd(S2, S3);
   Z[2] := ComplexAdd(S4, S5);
   Z[3] := ComplexSub(S4, S5);
   Z[4] := ComplexSub(S2, S3);
end;
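
As a small illustration of what porting to uComplex could look like, the 
sketch below checks that the component-wise M3 assignment in FFT_5 is the 
same thing as multiplying (T3 + T4) by the purely imaginary number i*c53, 
written with uComplex's operators. The value of c53 and the test inputs 
are made up for the check; they are not the constants of the FFT unit.

```pascal
program M3Check;
{$mode objfpc}

uses
  uComplex;

const
  c53 = 0.75;  // hypothetical value, NOT the FFT unit's real constant

var
  T3, T4, A, B: complex;
begin
  // Arbitrary test inputs.
  T3 := cinit(1.0, 2.0);
  T4 := cinit(-3.0, 0.5);

  // Component-wise form, as written in FFT_5.
  A.re := -c53 * (T3.im + T4.im);
  A.im :=  c53 * (T3.re + T4.re);

  // Operator form: multiply the sum by the purely imaginary number i*c53.
  B := cinit(0.0, c53) * (T3 + T4);

  // Both lines should show the same pair of values.
  WriteLn(A.re:0:4, ' ', A.im:0:4);
  WriteLn(B.re:0:4, ' ', B.im:0:4);
end.
```

The operator form is shorter, but it performs a full complex 
multiplication where the hand-written form needs only two real 
multiplications, so whether it is a win depends on what the compiler makes 
of it.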
