[fpc-devel] Successful implementation of inline support for pure assembler routines on x86
Jonas Maebe
jonas at freepascal.org
Sun Mar 17 19:57:25 CET 2019
On 17/03/2019 18:18, J. Gareth Moreton wrote:
> Part of it may be preference but I think
> some people like the fine degree of
> control that assembly language offers,
That is absolutely correct. That is both its strength and its weakness.
The weakness is that it is impossible to integrate such code safely in
compiler-generated code without the programmer saying exactly what that
code does (in terms of constraints, like GCC supports:
https://gcc.gnu.org/onlinedocs/gcc/Constraints.html ).
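To make the idea of constraints concrete, here is a minimal sketch in C
using GCC-style extended asm (GCC or Clang assumed). The function name
add_u32 is made up for illustration. The point is that the programmer
declares which operands are read, which are written and what else gets
changed, so the compiler never has to analyse the instruction itself:

```c
#include <stdint.h>

/* Hypothetical helper: adds two 32-bit values with an inline assembler
   block whose effects are fully described by constraints. */
static inline uint32_t add_u32(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__("addl %1, %0"
            : "+r"(a)   /* operand 0: read and written, any general register */
            : "r"(b)    /* operand 1: read only, any general register */
            : "cc");    /* clobber: the flags register is modified */
    return a;
#else
    return a + b;       /* portable fallback for other targets */
#endif
}
```

Because the register choice is left to the compiler, such a block can be
inlined into any caller without assumptions about where the arguments
live.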
E.g., at least the following issues exist with your patch, but that's
not because your code is of bad quality. It's simply that it is
impossible to fully analyse inline assembly and determine it to be safe:
* you forbid modifying the stack, but loading the stack pointer in
another register and then modifying the stack through this other
register is not caught (or e.g. loading a value from memory that happens
to point to the stack)
* you skip over db/dw/dd/dq directives, even though these can also be
used to encode instructions (often ones not (yet) supported by the
compiler). There may be more assembler directives like that that could
influence the code.
Additionally, your remark regarding memory barriers is a bit dangerous:
these instructions must not only act as memory barriers to the
processor, but also to the compiler. I.e., the compiler must not be
allowed to optimise certain things across such a barrier (e.g. (re)move
memory reads or writes), because then the barrier will no longer serve
its purpose. That is the main reason why marking them as "they change
everything" should probably stay for the foreseeable future.
The performance overhead of memory barriers is also many times greater
than that of a call/return, so I don't think it will actually matter
that much (although it would still be better to get rid of the
call/return than not, of course -- provided the compiler can be told to
not optimise anything across it).
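As a rough illustration of the difference between a barrier for the
compiler and one for the processor, here is a sketch in C (GCC or Clang
assumed; the names COMPILER_BARRIER, payload, ready and publish are
hypothetical). The "memory" clobber is how GCC-style inline assembly
tells the compiler that memory may have changed, so it must not move or
drop loads and stores across that point:

```c
/* Compiler-only barrier: emits no instruction, but the "memory" clobber
   forbids the compiler from moving memory accesses across it. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

static int payload;
static int ready;

/* Hypothetical publisher: the store to payload must not be reordered
   after the store to ready. */
static void publish(int value)
{
    payload = value;
    COMPILER_BARRIER();                       /* constrains the compiler only */
    __atomic_thread_fence(__ATOMIC_SEQ_CST);  /* constrains compiler and CPU */
    ready = 1;
}
```

A single-threaded run cannot demonstrate the ordering guarantee; the
sketch only shows where the compiler has to be told to stop optimising.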
I thought I sent a mail in the previous thread about this, but I can't
find it anymore, so maybe I did not. What I thought I said before is that
I think that inlining pure assembler functions is something that should
never be done. A pure assembler function, especially with
"nostackframe", is the programmer literally telling the compiler "you
have absolutely no business messing with this code".
On the other hand, if you have a regular function with an inline
assembler block, then inlining becomes a whole lot more feasible.
Especially if you add support for GCC-like constraints. Then there is no
issue with the assembler code expecting arguments in certain registers,
possibly returning in the middle of the block, messing up the stack etc,
because you simply cannot do that in this scenario. This means you don't
have to (try to) check for this either. And there is already rudimentary
support for specifying constraints in this case (which registers get
modified).
It would be much less of a quick win (e.g. because the compiler does not
support passing variables in registers to assembler blocks right now),
but in the long run it would be fully supportable and much more
maintainable. It would also require much less target-specific support,
because it would not require trying to figure out what the assembler
block is doing.
That said: for optimal performance, you will usually still want
intrinsics rather than inline assembly, simply because the compiler can
then be taught to reason about them, and perform constant propagation
through them (and potentially eliminate them altogether).
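The contrast can be sketched in C (GCC or Clang assumed; the function
names swap_intrinsic and swap_asm are made up for illustration). Both
functions reverse the byte order of a 32-bit value, but only the first
is something the compiler understands:

```c
#include <stdint.h>

/* Intrinsic: the compiler knows what this does, so it can typically
   fold a call with a constant argument at compile time. */
static inline uint32_t swap_intrinsic(uint32_t x)
{
    return __builtin_bswap32(x);
}

/* Inline assembly: correct, but opaque. The compiler only sees "some
   code that reads and writes x" and cannot evaluate it itself. */
static inline uint32_t swap_asm(uint32_t x)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__("bswap %0" : "+r"(x));
    return x;
#else
    return __builtin_bswap32(x);  /* fallback for other targets */
#endif
}
```

With optimisation enabled, a call such as swap_intrinsic(0x11223344)
can usually be replaced by a constant, whereas swap_asm still has to
execute the instruction at run time.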
Jonas