[fpc-devel] Successful implementation of inline support for pure assembler routines on x86

Tue Mar 19 00:34:01 CET 2019

On 18/03/2019 22:07, J. Gareth Moreton wrote:
> On Mon 18/03/19 20:23 , "Jonas Maebe" jonas at freepascal.org sent:
>>      Similarly, replacing return instructions with jumps is something 
>> you cannot do.
> 
> Why not?

Because parsing and trying to understand the meaning of inline assembly 
is something that is not supportable. Not even the most advanced 
disassemblers and reverse-engineering tools can do this with 100% certainty.

You can add a lot of fallbacks and safety checks, but even if you manage 
to get those airtight in the end, then the feature will have a bunch of 
limitations that are not necessarily foreseeable by users. This will 
lead to more irritation and bug reports than knowing in advance that 
something is not possible at all, because programmers will first waste 
time on trying to get it to work.

Additionally, if you support parsing the inline assembler code on one 
platform, you have to support it on all platforms. That is simply how 
FPC gets developed. "Most users use platform X so we don't need to 
support it for the rest" (or at least consider in advance how well it 
can be supported for the rest) is not how we do things.

> True, this "inline" for pure assembler procedures takes it one step 
> towards a higher-level language, but this is another reason why what you 
> can do is limited.  In terms of control and power, I put it between 
> true, raw assembly language and intrinsics, since you're limited to 
> volatile registers but have much more control on where temporary values 
> are stored.  I simply wish to develop a tool for programmers who need 
> that raw speed and control using semantics that already exist, while 
> making it as safe as possible.  I myself have a number of uses for the 
> feature, and I can see where the RTL can benefit too.

I don't dispute that there are uses for this functionality. However, I 
disagree that this is the best, or even a good, way to address those needs.

> By the way, I updated the patches to patch the trick of assigning RSP to 
> another register and then dereferencing that, and literal byte values via 
> DB etc are now forbidden.  I do invite you to try to break it.  (It is 
> overly conservative though... if you do something like MOV RAX, RSP 
> followed by MOV RAX, RDX (so it no longer depends on RSP) then 
> dereferencing RAX, the compiler will still consider it writing to the 
> stack and forbid inlining - after all, you're obviously trying to do 
> something unusual)

You can still write that register to memory and then load the value in 
another register from memory.
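
To make the round trip through memory concrete, here is a minimal sketch. 
It is not taken from the patches. The routine name and the Scratch 
parameter are hypothetical, and it assumes x86-64, Intel assembler syntax 
and that FPC resolves the parameter name to the register holding its 
address.

```pascal
program LaunderRsp;
{$mode objfpc}
{$asmmode intel}

{ Hypothetical routine, x86-64 only. It reads the qword at the top of the
  stack (the return address) without ever copying RSP register-to-register. }
function ReadStackViaMemory(var Scratch: PtrUInt): PtrUInt; assembler; nostackframe;
asm
  mov rdx, Scratch   // rdx := address of Scratch (assumed to arrive in a register)
  mov [rdx], rsp     // write the stack pointer to ordinary memory
  mov rax, [rdx]     // reload it: rax now equals RSP, but came from memory
  mov rax, [rax]     // dereference: this is a stack access
end;

var
  Scratch: PtrUInt;
begin
  WriteLn(HexStr(ReadStackViaMemory(Scratch), 16));
end.
```

A check that only follows register-to-register copies of RSP sees nothing 
suspicious in the last instruction. Catching it would mean tracking every 
memory location the routine may write.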

> I hope that our philosophies don't conflict.  I'm worried.  I want to 
> find the balance between tradition and revolution, for lack of a better 
> term.  If you're not allowed to do something, or something must be done 
> in a particular way, I ask "why?".

Several reasons have been given in this thread already:
* many of these things can be done equally well or better using 
(potentially cpu-specific) intrinsics. Additionally, intrinsics have the 
advantage that they can generate different code at compile time 
depending on the selected target processor (or give an error if the 
target processor does not support them) rather than crash at run time, 
and since they integrate in the code generator they can result in better 
overall code quality. Not today, but definitely in the future. With 
inline assembler, especially inline pure assembler routines, you will 
forever be limited to hardcoded registers. The compiler is almost 28 
years old by now. While in the beginning a lot of short term quick-win 
things were implemented, we have paid the price for those over the years 
in terms of maintenance, having to rewrite them, having to support them 
when adding support for new architectures/platforms etc.
* parsing assembly code and trying to understand what it does, or at 
least figuring out it does not do anything dangerous, is insanely 
difficult. At the same time, all of those limitations will reduce the 
possible use cases for the result.
* allowing optimizers to interact with inline assembly (apart from 
constraints-related values) must never be done, because it is impossible 
to guarantee that the result will be correct.
* as I explained previously, allowing inlining of plain Pascal 
procedures with assembler blocks is more logical, safer (does not need 
any extra checks), more portable (it needs no target-specific code other 
than saving/restoring the tai's from/to ppu-files) and more maintainable.
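
For reference, the kind of routine meant by the last point could look 
like the sketch below. The function name and the choice of instruction 
are hypothetical, and it assumes x86-64 with Intel assembler syntax. The 
"inline" directive only marks where such inlining would apply. As the 
point above implies, the compiler does not inline a routine like this at 
the time of writing.

```pascal
program InlineAsmBlock;
{$mode objfpc}
{$asmmode intel}

{ Hypothetical example, x86-64 only: a plain Pascal function wrapping a
  small assembler block. The compiler manages the parameter, the local and
  the result; the block only refers to them by name and lists its clobbers. }
function RotateLeft8(Value: QWord): QWord; inline;
var
  Tmp: QWord;
begin
  asm
    mov rax, Value   // load the parameter from wherever the compiler placed it
    rol rax, 8       // the instruction the routine exists for
    mov Tmp, rax     // pass the result back through a Pascal local
  end ['RAX'];       // declare RAX as modified by the block
  Result := Tmp;
end;

begin
  WriteLn(HexStr(RotateLeft8(QWord($0123456789ABCDEF)), 16));
end.
```

Because the block touches only named Pascal symbols and a declared 
register, the compiler does not have to work out what the instructions 
mean before it can place the routine in a caller.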

Jonas