[fpc-devel] Vectorization
J. Gareth Moreton
gareth at moreton-family.com
Sun Dec 10 20:01:05 CET 2017
My current idea (without having looked at
any existing theory) is to use a kind of
sliding window, similar to how ZIP and
other LZ77-based algorithms compress
repeating strings: look backwards in the
current block for a matching instruction
and then scan forward from both points. If
the scan gets up to the instruction right
before the starting point, then the code
is potentially vectorisable. Using the
previous example:
movss 16(%rsp),%xmm0
addss 32(%rsp),%xmm0
movss %xmm0,(%rax)
movss 20(%rsp),%xmm0
addss 36(%rsp),%xmm0
movss %xmm0,4(%rax)
Starting at the 4th instruction, it looks
back and finds a match in the 1st
instruction, albeit with an address that
differs only by 4 (one 32-bit float). As
it scans forward, it finds similar matches
in the subsequent instructions, and
eventually realises that the entire block
could potentially be vectorised. If it
continues, it finds that the code fragment
repeats 4 times and can be vectorised with
little difficulty. It helps that these are
all SSE instructions.
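
As a rough illustration of the look-back-and-scan idea above, here is
a minimal Python sketch (not FPC code, and not tested against the
compiler). The names parse, matches and find_repeat are hypothetical,
and it assumes simple AT&T operands of the form disp(%reg) and a fixed
stride of 4 bytes between repeats:

```python
import re
from typing import List, Optional, Tuple

# Matches a simple AT&T memory operand such as "16(%rsp)" or "(%rax)".
# Assumption: no index/scale forms like (%rax,%rbx,4) are handled.
MEM = re.compile(r"^(-?\d*)\((%\w+)\)$")


def parse(line: str) -> Tuple[str, List[str]]:
    """Split 'movss 16(%rsp),%xmm0' into ('movss', ['16(%rsp)', '%xmm0'])."""
    parts = line.split(None, 1)
    operands = parts[1] if len(parts) > 1 else ""
    return parts[0], [op.strip() for op in operands.split(",") if op.strip()]


def matches(a: str, b: str, stride: int) -> bool:
    """True if b equals a with every memory displacement advanced by stride."""
    op_a, args_a = parse(a)
    op_b, args_b = parse(b)
    if op_a != op_b or len(args_a) != len(args_b):
        return False
    for x, y in zip(args_a, args_b):
        mx, my = MEM.match(x), MEM.match(y)
        if mx and my:
            if mx.group(2) != my.group(2):
                return False  # different base register
            if int(my.group(1) or 0) - int(mx.group(1) or 0) != stride:
                return False  # displacement is not exactly one stride on
        elif x != y:
            return False      # registers/immediates must be identical
    return True


def find_repeat(block: List[str], start: int,
                stride: int = 4) -> Optional[Tuple[int, int]]:
    """Look back from block[start] for a match, then scan forward.

    Returns (period, repeats) if the 'period' instructions before start
    repeat from start onwards, otherwise None.
    """
    for back in range(start - 1, -1, -1):
        if not matches(block[back], block[start], stride):
            continue
        period = start - back
        repeats, pos = 1, start
        # Each pass of the loop checks one full copy of the fragment, so
        # the scan has reached the instruction right before 'pos'.
        while pos + period <= len(block) and all(
            matches(block[pos - period + k], block[pos + k], stride)
            for k in range(period)
        ):
            repeats += 1
            pos += period
        if repeats > 1:
            return period, repeats
    return None


block = [
    "movss 16(%rsp),%xmm0",
    "addss 32(%rsp),%xmm0",
    "movss %xmm0,(%rax)",
    "movss 20(%rsp),%xmm0",
    "addss 36(%rsp),%xmm0",
    "movss %xmm0,4(%rax)",
]
print(find_repeat(block, 3))  # (3, 2)
```

For the six instructions shown, starting at the 4th one (index 3)
reports a fragment of 3 instructions occurring twice. A real
implementation would also have to check alignment, aliasing and
whether the registers are still live afterwards, which this sketch
ignores.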
Kit
P.S. I did look at the loop unrolling
code, but it almost never triggers because
of the small instruction cache that it
assumes. For x86-64, is it safe to assume
a cache length of 60 instead of 30, since
almost all modern Intel and AMD processors
have 56 or more entries in their queues?
On Sun 10/12/17 13:50, "Florian Klämpfl" florian at freepascal.org sent:
> On 10.12.2017 at 02:29, J. Gareth Moreton wrote:
> > Hi everyone,
> >
> > Since I'm masochistic in my desire to understand and improve the
> > Free Pascal Compiler, I would like to add some vectorisation
> > support in its optimisation cycle, since that is one thing that
> > many other compilers attempt to do these days. But before I begin,
> > does FPC support any kind of vectorisation already? If it does I
> > haven't been able to find it yet, and I don't want to end up
> > reinventing the wheel.
>
> I once started to work on this, but never merged it into FPC trunk.
> It might even exist only in my local git checkout; I can look for it.
>
> > I'm sure it's a mammoth task, but I would like to start somewhere
> > with it - however, are there any design plans that I should be
> > adhering to so I don't end up designing something that is disliked?
>
> Well, basically it means that another pass over the tree (like e.g.
> unroll_loop in optloop.pas) must be added which generates operations
> as they can be encoded by -Sv. To do this efficiently, some previous
> simplification of the tree is probably needed. But this is something
> for later.