[fpc-devel] Vectorization

J. Gareth Moreton gareth at moreton-family.com
Sun Dec 10 20:01:05 CET 2017

The idea I have at the moment (without 
having looked at any prior theory) is to 
use a kind of sliding window, similar to 
how ZIP and other LZ77-based algorithms 
work when compressing repeating strings: 
look backwards in the current block for a 
matching instruction and then scan 
forward. If the scan gets up to the 
instruction right before the starting 
point, then the block is a candidate for 
vectorisation. Using the previous example:

movss 16(%rsp),%xmm0
addss 32(%rsp),%xmm0
movss %xmm0,(%rax)
movss 20(%rsp),%xmm0
addss 36(%rsp),%xmm0
movss %xmm0,4(%rax)

Starting at the 4th instruction, it looks 
back and finds a match in the 1st 
instruction, albeit with an address that 
differs only by 4. As it scans forward, it 
finds similar matches in the subsequent 
instructions, and eventually realises that 
the entire block could potentially be 
vectorised. If it continues, it finds that 
the code fragment repeats 4 times and can 
be vectorised with little difficulty. It 
helps that these are all SSE instructions, 
too.
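
A minimal sketch of the look-back-and-scan idea described above, written
in Python for brevity rather than as compiler code. Everything in it is
an assumption made for illustration: the names Instr, parse, matches and
find_repeat are hypothetical and not part of FPC, instructions are plain
AT&T-syntax strings, and only simple memory operands of the form
disp(%reg) are handled. It only detects a candidate run; it does not
check alignment, aliasing or register liveness.

```python
import re
from typing import NamedTuple


class Instr(NamedTuple):
    op: str        # mnemonic, e.g. "movss"
    shape: tuple   # operands with memory displacements blanked out
    disps: tuple   # the blanked-out displacements, in operand order


# Matches simple AT&T memory operands such as 16(%rsp) or (%rax).
# Assumption: no index/scale forms like (%rax,%rbx,4).
_MEM = re.compile(r'(-?\d*)\((%\w+)\)')


def parse(line):
    """Split one instruction into mnemonic, operand shape and displacements."""
    op, _, rest = line.strip().partition(' ')
    disps = []

    def blank(m):
        disps.append(int(m.group(1) or 0))
        return '*(' + m.group(2) + ')'

    shape = tuple(_MEM.sub(blank, o.strip()) for o in rest.split(','))
    return Instr(op, shape, tuple(disps))


def matches(a, b, step):
    """True if b is the same instruction as a with every displacement + step."""
    return (a.op == b.op and a.shape == b.shape
            and all(y - x == step for x, y in zip(a.disps, b.disps)))


def find_repeat(instrs):
    """Return (start, period, step, reps) for the first repeating run, or None."""
    n = len(instrs)
    for i in range(1, n):
        for j in range(i - 1, -1, -1):      # look backwards from i
            a, b = instrs[j], instrs[i]
            if a.op != b.op or a.shape != b.shape or not a.disps:
                continue
            step = b.disps[0] - a.disps[0]
            if step == 0:
                continue
            period = i - j
            reps = 1
            # Scan forward one block at a time: block r must equal
            # block 0 with every displacement shifted by r * step.
            while j + (reps + 1) * period <= n and all(
                matches(instrs[j + k], instrs[j + reps * period + k],
                        reps * step)
                for k in range(period)
            ):
                reps += 1
            if reps >= 2:
                return j, period, step, reps
    return None


block = [parse(s) for s in (
    "movss 16(%rsp),%xmm0",
    "addss 32(%rsp),%xmm0",
    "movss %xmm0,(%rax)",
    "movss 20(%rsp),%xmm0",
    "addss 36(%rsp),%xmm0",
    "movss %xmm0,4(%rax)",
)]
print(find_repeat(block))
```

For the six instructions quoted above this reports a run starting at
instruction 0 with a period of 3, an address step of 4 and 2
repetitions. A run of 4 repetitions with a step of 4 bytes would be the
case that fits one 128-bit SSE register.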


P.S. I did look at the loop unrolling 
code, but it almost never triggers because 
of the small instruction cache that it 
assumes. For x86-64, is it safe to assume 
a cache length of 60 instead of 30, since 
almost all modern Intel and AMD processors 
have 56+ elements in their queues?

On Sun 10/12/17 13:50, "Florian Klämpfl" florian at freepascal.org sent:
> On 10.12.2017 at 02:29, J. Gareth wrote
> > Hi everyone,
> > 
> > Since I'm masochistic in my desire to
> > and improve the Free Pascal Compiler, I would like to add
> > some vectorisation support in its
> > cycle, since that is one thing that many other compilers
> > attempt to do these days.  But before I begin,
> > does FPC support any kind of vectorisation already?  If it
> > does I haven't been able to find it yet, and I
> > don't want to end up reinventing the
>
> I started once to work on this, but never merged it into fpc trunk, it
> might be even only in my
> local git check out, I can look for it.
> > 
> > I'm sure it's a mammoth task, but I would like
> > to start somewhere with it - however, are there any design
> > plans that I should be adhering to so I don't
> > end up designing something that is
> > 
> Well, basically it means that another pass (like e.g. unroll_loop in
> optloop.pas) of the tree must
> be added which generated operations as they can be encoded by -Sv. To do
> this efficiently, probably
> some previous simplification of the tree is needed. But this is something
> for later.