<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>To show current progress, here is the code generated on
      x86_64-win64 for the routine in .section
.text.n_system.math.vectors$_$tvector3d_$__$$_$plus$tvector3d$tvector3d$$tvector3d,"ax"
      ("class operator TVector3D.+(const aVector1, aVector2:
      TVector3D): TVector3D;").  It performs a component-wise addition
      of two 4-component vectors; the inputs are pointed to by %rdx and
      %r8, and the result is pointed to by %rcx.  Before:</p>
    <p>    movss    (%rdx),%xmm0<br>
          addss    (%r8),%xmm0<br>
          movss    %xmm0,(%rcx)<br>
          movss    4(%rdx),%xmm0<br>
          addss    4(%r8),%xmm0<br>
          movss    %xmm0,4(%rcx)<br>
          movss    8(%rdx),%xmm0<br>
          addss    8(%r8),%xmm0<br>
          movss    %xmm0,8(%rcx)<br>
          movss    12(%rdx),%xmm0<br>
          addss    12(%r8),%xmm0<br>
          movss    %xmm0,12(%rcx)<br>
          ret<br>
    </p>
    <p>After:</p>
    <p>    movups    (%rdx),%xmm0<br>
          addps    (%r8),%xmm0<br>
          movups    %xmm0,(%rcx)<br>
          ret<br>
    </p>
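    <p>For readers who think in C rather than assembly, the two listings
      above correspond roughly to the sketch below.  It is only an
      illustration, not compiler output or code from the branch: Vec4,
      add_scalar and add_packed are made-up names, and it assumes an x86
      target with SSE available.</p>

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Hypothetical stand-in for the 4-component vector: four packed singles. */
typedef struct { float v[4]; } Vec4;

/* Scalar form: one load, add and store per component,
   comparable to the four movss/addss/movss groups. */
static Vec4 add_scalar(const Vec4 *a, const Vec4 *b) {
    Vec4 r;
    for (int i = 0; i < 4; i++)
        r.v[i] = a->v[i] + b->v[i];
    return r;
}

/* Packed form: unaligned 128-bit loads, one packed add, one unaligned
   store, comparable to movups/addps/movups. */
static Vec4 add_packed(const Vec4 *a, const Vec4 *b) {
    Vec4 r;
    __m128 va = _mm_loadu_ps(a->v);
    __m128 vb = _mm_loadu_ps(b->v);
    _mm_storeu_ps(r.v, _mm_add_ps(va, vb));
    return r;
}
```

    <p>Both functions should return the same four values for the same
      inputs; the packed one simply handles all four lanes at once.</p>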
    <p>(Note that this unit is NOT suitable for 3D graphics, because the
      W-coordinate is treated the same as X, Y and Z rather than being
      kept at 1, say.)<br>
    </p>
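    <p>To make the W-coordinate point concrete, here is a small C sketch.
      It is an assumption-laden illustration rather than anything from
      the unit: Vec4, add_all and add_keep_w are hypothetical names, and
      add_keep_w is just one possible way a graphics-oriented type might
      behave.</p>

```c
/* Hypothetical 4-component vector; v[3] is the W-coordinate. */
typedef struct { float v[4]; } Vec4;

/* Adds all four components alike, which is what the unit's operator does. */
static Vec4 add_all(const Vec4 *a, const Vec4 *b) {
    Vec4 r;
    for (int i = 0; i < 4; i++)
        r.v[i] = a->v[i] + b->v[i];
    return r;
}

/* Illustrative alternative (not part of the unit): add X, Y and Z,
   then force W back to 1. */
static Vec4 add_keep_w(const Vec4 *a, const Vec4 *b) {
    Vec4 r = add_all(a, b);
    r.v[3] = 1.0f;
    return r;
}
```

    <p>With two inputs whose W is 1, add_all yields W = 2, whereas
      add_keep_w leaves it at 1.</p>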
    <p>It's not perfect though.  Under vectorcall or the System V ABI,
      it can produce worse code because it constantly needs to break up
      the vector to manipulate individual components, and there may be
      some unnecessary reads and writes between the stack and XMM
      registers, but I'm working on it, bit by bit.</p>
    <p>My current branch can be found here:
<a class="moz-txt-link-freetext" href="https://gitlab.com/CuriousKit/optimisations/-/commits/ucomplex-x86-vector">https://gitlab.com/CuriousKit/optimisations/-/commits/ucomplex-x86-vector</a><br>
    </p>
    <p>Kit<br>
    </p>
    <div class="moz-cite-prefix">On 23/08/2024 17:56, J. Gareth Moreton
      via fpc-devel wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:82c22f9f-b3b1-411f-af0d-45e06359b187@moreton-family.com">
      <p>Hi everyone,</p>
      <p>So I'm getting ready to showcase my current vector work to
        others.  I do have a question though...</p>
      <p>Currently the feature is locked behind "-Sv", since this is
        specifically "support vector processing" and the feature is
        still experimental and inefficient in places, but is this the
        right approach?  I ask because the -S switches are specifically
        syntax options, not code generation options (I do wonder exactly
        what syntax it enables).  Also, at least with the "make" script,
        it skips whole program optimisation and some of the packages.</p>
      <p>Should I use a compiler definition instead like
        "-dX86_VECTORS"?  That way, the feature can easily be turned
        off.<br>
      </p>
      <p>Kit<br>
      </p>
      <div class="moz-cite-prefix">On 21/08/2024 15:59, J. Gareth
        Moreton via fpc-devel wrote:<br>
      </div>
      <blockquote type="cite"
cite="mid:f2645374-bc88-4ea5-835d-3c08490aac19@moreton-family.com">
        <p>Hi everyone,</p>
        <p>Just thought I'd give a heads-up on what's happening with me
          and the compiler improvements.  Also, I've been busy with
          contract work and have just had some minor surgery, so I'm not
          running on all cylinders currently.</p>
        <ul>
          <li>Still waiting on administrator comments and feedback on my
            assembly-level CSE feature (a couple of years old now) and
            the first part of pure functions.  Both of these should be
            ready to merge unless someone found a bug that breaks things
            (someone did find some examples with pure functions which
            have since been fixed).<br>
          </li>
          <li>Haven't solved the SEH unwinding problem on aarch64-win64
            yet.  This is quite a tough one!</li>
          <li>Also working on vectorisation for x86_64 platforms.  I've
            got it working on win64, and can vectorise two-operand
            commutative operations like addition and multiplication,
            although some of the generated code is less than optimal
            (unnecessarily copying vectors to the stack).  Linux (and
            other OSes that use the System V ABI) is taking a bit
            longer, since that ABI stores pairs of floats in single XMM
            registers even without vectorisation code, and some of the
            internal procedures can't properly handle these when the aim
            is to combine a pair of such registers (so 4 singles) into a
            single XMM vector, especially where shuffling is involved.</li>
        </ul>
        <p>I'll let you know the progress.</p>
        <p>Kit<br>
        </p>
        <br>
        <fieldset class="moz-mime-attachment-header"></fieldset>
        <pre class="moz-quote-pre" wrap="">_______________________________________________
fpc-devel maillist  -  <a
        class="moz-txt-link-abbreviated moz-txt-link-freetext"
        href="mailto:fpc-devel@lists.freepascal.org"
        moz-do-not-send="true">fpc-devel@lists.freepascal.org</a>
<a class="moz-txt-link-freetext"
href="https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel"
        moz-do-not-send="true">https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel</a>
</pre>
      </blockquote>
    </blockquote>
  </body>
</html>