[fpc-devel] Kit's ambitions!
J. Gareth Moreton
gareth at moreton-family.com
Sun May 13 04:28:58 CEST 2018
So for those who have observed me, I can be a little... excitable and
optimistic sometimes with developing for Free Pascal. Sometimes it might
not be for the best, but I'd like to think that what I propose and create
can be implemented relatively seamlessly. My main speciality
is in optimisation and assembly language, and this is where I feel I can
contribute the most. Here is what I'm currently looking at:
- #0033549 - Add in the x86-64 instructions PDEP and PEXT.
This has been requested by another user and seems straightforward enough
to implement. Florian mentioned that there's a standard test design for
new assembler commands, but I can't for the life of me remember what it is
- can someone fill me in?
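
For the PDEP and PEXT item above, a portable reference for what the two instructions compute can be handy when checking generated code. The following is only a minimal C sketch, not the standard test design mentioned; the names pext_ref and pdep_ref are made up for illustration.

```c
#include <stdint.h>

/* PEXT: gather the bits of src selected by mask into the
   contiguous low bits of the result. */
uint64_t pext_ref(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    uint64_t out_bit = 1;                      /* next result bit to fill */
    while (mask != 0) {
        uint64_t lowest = mask & (~mask + 1);  /* isolate lowest set mask bit */
        if (src & lowest)
            result |= out_bit;
        out_bit <<= 1;
        mask &= mask - 1;                      /* clear lowest set mask bit */
    }
    return result;
}

/* PDEP: scatter the contiguous low bits of src to the
   bit positions selected by mask. */
uint64_t pdep_ref(uint64_t src, uint64_t mask)
{
    uint64_t result = 0;
    uint64_t in_bit = 1;                       /* next source bit to read */
    while (mask != 0) {
        uint64_t lowest = mask & (~mask + 1);  /* isolate lowest set mask bit */
        if (src & in_bit)
            result |= lowest;
        in_bit <<= 1;
        mask &= mask - 1;                      /* clear lowest set mask bit */
    }
    return result;
}
```

With these definitions, pext_ref(0x12345678, 0xFF00FF00) yields 0x1256, and pdep_ref(0x1256, 0xFF00FF00) yields 0x12005600.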
- Expand on Data Flow Analysis in the compiler.
I'm proposing what I personally call the "Deep Optimizer": an
assembler-level optimisation system (although it won't touch pure assembler
routines) that rearranges instructions and changes registers in order to
minimise pipeline stalls, and that also collapses a "div" and a "mod"
operation into a single instruction where possible. There are quite a few
situations that can't be caught by the peephole optimiser - there's one
example I've seen with "MOV", "LEA" and "MOV" being generated - if the LEA
wasn't present, the peephole optimiser would make the MOVs more efficient,
but because LEA is in between them, it misses the optimisation and causes a
pipeline stall there due to a read-after-write penalty.
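
To illustrate the "div" and "mod" case: the x86-64 DIV instruction already yields both the quotient and the remainder, so a pair of operations on the same operands needs only one division. Below is a minimal C sketch of that equivalence, assuming a nonzero divisor; the names divmod_u32 and divmod_once are hypothetical and are not part of the compiler.

```c
#include <stdint.h>

/* Quotient and remainder of one unsigned division. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
} divmod_u32;

/* Computes a / b and a % b with a single division.
   The remainder is recovered from the quotient, which mirrors what a
   single hardware divide provides. Assumes b != 0. */
divmod_u32 divmod_once(uint32_t a, uint32_t b)
{
    divmod_u32 r;
    r.quot = a / b;
    r.rem = a - r.quot * b;  /* same value as a % b, no second division */
    return r;
}
```

For example, divmod_once(17, 5) gives a quotient of 3 and a remainder of 2.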
- Research possibility for 'inline' support for certain assembler routines.
For situations where speed is of the highest priority, there are some
internal functions such as Int and Frac that can theoretically be inlined
(a procedure call is quite expensive, around 50 cycles), but because they
are written in pure assembly language, the compiler will never inline
them. I'm still working out quite a bit of theory, but I believe I will
be able to allow the inlining of routines that are leaf functions (don't
have CALLs of their own) and declared as 'nostackframe'. Such a system
would allow the support of 'intrinsics' that can be composed
programmatically rather than as internal routines, though it's not exactly
what Florian is planning. Even if Florian does go for a different approach
for intrinsics, I like to think that such inline support will have uses
elsewhere, especially some of the routines like "GetStackFrame" (I think)
that simply return the value of RSP (if it's 'inline', which it is actually
declared as in the unit, the return value will be far more accurate).
That's all for now. A lot of it is personal research, but if I find any
little gems, I want to share them with the world. I like playing around with
mathematics and number crunching, so speed is highly sought after, and I
like to improve Free Pascal so I can say it's a viable tool for the job (of
course, it already is if you know what you're doing).
Gareth aka. Kit