[fpc-devel] Proof of concept. Inline support for pure assembler routines on x86
J. Gareth Moreton
gareth at moreton-family.com
Tue Feb 12 12:43:10 CET 2019
This is something I've been researching for a while: the ability to inline
procedures that are written in pure assembler. I've got something
working pretty well and I'd like to showcase it.
It's something that's garnered a little uncertainty from others because of
how easy it is to introduce compiler bugs and how hard it is to offer
support on other platforms, for example. Currently, my code is restricted
to i386 and x86_64 because that's all I can actually test on. Nevertheless,
the new virtual methods will easily allow extension to other CPUs while
blocking "inline" on pure assembler routines by default.
There are a number of restrictions on what can and can't be inlined,
specifically:
- The routine must have the "nostackframe" directive.
- You cannot write to the stack.
- No parameters or return values may be on the stack.
- You cannot write to a non-volatile register.
... among a few others. The internal procedure checks each instruction
against the "InsProp" array. A number of the opcodes just have "Ch_All"
specified, which my code assumes to mean that everything is modified, so
it marks the procedure as 'cannot inline'.
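To give a rough feel for this kind of check, here is a toy model in Python. It is not the code from the patch: the WRITES table is a hypothetical stand-in for "InsProp", the instruction format is invented for the example, and the register set is the Win64 list of callee-saved general-purpose registers. Anything whose effects are unknown is rejected, mirroring the conservative handling of "Ch_All".

```python
# Toy model only: the real check lives in the compiler and uses its InsProp array.
# Win64 callee-saved (non-volatile) general-purpose registers.
NONVOLATILE_WIN64 = {"rbx", "rbp", "rdi", "rsi", "rsp", "r12", "r13", "r14", "r15"}

# Hypothetical stand-in for InsProp: the operand indices each opcode writes to,
# "stack" for opcodes that push or pop, or "all" where the effect is unknown
# (the analogue of Ch_All).
WRITES = {
    "mov": {0},
    "add": {0},
    "lea": {0},
    "cmp": set(),
    "ret": set(),
    "push": "stack",
    "pop": "stack",
    "cpuid": "all",
}


def can_inline(instructions):
    """Return (ok, reason) for a list of (opcode, operands) tuples."""
    for opcode, operands in instructions:
        effect = WRITES.get(opcode, "all")  # unknown opcodes: be conservative
        if effect == "all":
            return False, f"{opcode}: unknown effects, assume everything is modified"
        if effect == "stack":
            return False, f"{opcode}: touches the stack"
        for index in effect:
            target = operands[index]
            if target.startswith("["):
                # Memory destination: reject it if it is addressed via the stack.
                if "rsp" in target or "rbp" in target:
                    return False, f"{opcode}: writes to the stack"
            elif target in NONVOLATILE_WIN64:
                return False, f"{opcode}: writes to non-volatile register {target}"
    return True, "ok"
```

A routine such as `lea rax, [rcx+rdx]; ret` passes this model, while anything that writes RBX, pushes a value, or uses an opcode with unknown effects is refused.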
I've so far built this on x86_64-win64, i386-win32 and x86_64-linux, and
done some tests with internal functions and some showcase functions, with
promising results. I couldn't do i386-linux due to problems with missing
tools, although it gets quite far in the compilation otherwise. If someone
can do a more strenuous test, I'd be grateful.
Some other things that the inlining routine does:
- If jumps and labels are found, new local ones are generated.
- If RET is found, it is changed to a JMP and a new destination label
generated at the end of the inserted code.
- The markers at the beginning and end are removed, so peephole
optimisation is actually performed on the inserted code - this is mostly to
address some inefficiencies that crop up from moving parameters into the
expected registers (e.g. the first integer parameter into RCX under Win64).
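The first two points can be illustrated with a small Python sketch. Again, this is a toy model and not the patch itself: the tuple-based instruction format, the JUMPS set, the function name inline_body and the generated label names are all made up for the example. Each inline site gets its own copies of the routine's labels, and every RET becomes a JMP to a fresh label placed after the inserted code.

```python
# Toy model only: instruction tuples and label naming are invented for illustration.
JUMPS = {"jmp", "jz", "jnz", "je", "jne", "ja", "jb"}


def inline_body(instructions, site):
    """Return a copy of a routine body that is safe to paste at inline site `site`.

    instructions: list of tuples such as ("label", name), ("jz", target),
    ("ret",) or any other (opcode, operand, ...) tuple.
    """
    renamed = {}

    def local(name):
        # Give every label of the routine a fresh name unique to this site.
        if name not in renamed:
            renamed[name] = f".Linl{site}_{name.lstrip('.')}"
        return renamed[name]

    exit_label = f".Lret{site}"  # destination that replaces RET
    out = []
    saw_ret = False
    for ins in instructions:
        opcode = ins[0]
        if opcode == "label":
            out.append(("label", local(ins[1])))
        elif opcode in JUMPS:
            out.append((opcode, local(ins[1])))
        elif opcode == "ret":
            out.append(("jmp", exit_label))
            saw_ret = True
        else:
            out.append(ins)  # ordinary instruction, copied unchanged
    if saw_ret:
        out.append(("label", exit_label))
    return out


body = [
    ("cmp", "rcx", "rdx"),
    ("jz", ".Lsame"),
    ("mov", "rax", "rcx"),
    ("ret",),
    ("label", ".Lsame"),
    ("xor", "eax", "eax"),
    ("ret",),
]
first = inline_body(body, 1)
second = inline_body(body, 2)
```

Inlining the same body at two sites yields two disjoint sets of labels, so the copies cannot jump into each other.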
To Florian, I know this work is somewhat unsanctioned, but I would like to
show that it can be done in a way that's clean. Even if this is still a
definite no, well, I managed to find and squash bug #35065!
Another e-mail will follow this one that adds a patch that inlines some
RTL routines, and a small test program.
Gareth aka. Kit
NOTE: Make sure you have the current trunk, because this code triggered
Internal Error 200208181 due to a bug in "tai_cpu_abstract.ppuload" that
was only fixed this morning.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: x86-inline-assembler-proof-of-concept.patch
Type: application/octet-stream
Size: 28444 bytes
Desc: not available
URL: <http://lists.freepascal.org/pipermail/fpc-devel/attachments/20190212/3c8151ba/attachment.obj>