[fpc-devel] inline... and philosophy
J. Gareth Moreton
gareth at moreton-family.com
Fri Nov 8 04:01:01 CET 2019
Hi everyone,
This is probably more rant-like than it's supposed to be, and maybe a
bit philosophical, especially after I ran my mouth in the jump
optimisation issue.
So I'm wondering what the future will be for the inline directive, since it
seems to be very divisive. On one side, its use is questionable with
proposals for 'auto-inline', while on the other side some live by it for
minor performance gains. There are always going to be unexpected uses for
things, and one point of contention that was recently raised is the way
I used it on a function that I knew was only called in one particular
place and would stay that way for the foreseeable future. Though I wrote
a comment to explain what I did, apparently that's not a valid reason or
use for it.
I guess that's also an interesting argument in terms of purity of
purpose... do you go by what something was designed to do or what it
actually can do? There's an argument for both - the approach of purity
yields cleaner code but it might suffer with performance, and the
approach of practicality is potentially more efficient, but may be much
more difficult to maintain.
Granted, regardless of one's stance, one does have to abide by a
project's coding standards, even if they are a little nebulous sometimes
and one doesn't often know what they are until they're violated. I
apologise for getting so upset over the presence of a handful of
directives in my patch - to explain, there were a couple of new methods
related to my jump optimisations that were marked as inline (so long as
DEBUG_JUMP wasn't defined) - each function was only called in exactly
one location with no intention of calling them elsewhere. It was set up
this way to compartmentalise the code, separating the jump optimisations
from the main loop of the Peephole Optimizer. Florian took issue with
the inline directives despite the comments explaining why they were
there, since it wasn't exactly in the spirit of what inline is used for
(leaf functions that compile into only a few machine code
instructions). I wasn't too happy about the argument that one of the
functions MAY get called a second time in the future - possibly, but you
can remove the inline directive when that time comes (they'd be looking
up the function header at the very least to see what the parameters are).
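
For illustration, a minimal sketch of the pattern described above: a helper that is marked inline unless DEBUG_JUMP is defined. Only the DEBUG_JUMP name comes from the post; the function IsShortJump and its body are hypothetical stand-ins, not code from the patch.

```pascal
program InlineSketch;

{$mode objfpc}
{$inline on}

{ Remove the space before the dollar sign to turn the next line into a real
  directive. With DEBUG_JUMP defined, the helper stays an ordinary call, so it
  is easier to find in a debugger or a disassembly. }
{ $define DEBUG_JUMP}

{ Hypothetical helper standing in for one of the jump-optimisation methods.
  It is only meant to be called from one place, hence the inline hint. }
function IsShortJump(Distance: Integer): Boolean; {$ifndef DEBUG_JUMP}inline;{$endif}
begin
  Result := (Distance >= -128) and (Distance <= 127);
end;

var
  I: Integer;
begin
  { The single call site. }
  for I := 126 to 129 do
    if IsShortJump(I) then
      WriteLn(I, ': short jump')
    else
      WriteLn(I, ': near jump');
end.
```

The directive is only a hint, so the compiler may still emit a normal call. If a second caller appears later, deleting the conditional inline from the header is a one-line change.
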
I guess sometimes, especially when you've worked on some closed-source
applications, you come across the occasional 'black magic' (for a good
real-world example, look up "fast inverse square root") that takes a
University thesis to explain! Depending on where you work, some are
quite strict with coding standards and how many changes you squeeze into
a single commit, while others are a bit more cavalier... which leads to the
minefield that's legacy code! For me personally, I've always pushed for
performance, hence why I have no problem dropping into assembly language
on some critical leaf functions, but I do want to go out of my way to
explain what I'm doing so another programmer can follow what's happening
and improve on it if needs be.
To go back to the original subject, what are the intentions with inline?
I've gotten the impression from Florian that he doesn't like the
directive and would much rather leave it to the compiler to auto-inline
short functions, which I guess is fair, and the compiler can potentially make
better judgement calls on a per-platform basis (e.g. vector addition on AMD64
platforms collapses into a single machine instruction (not counting
memory moving), whereas a platform that doesn't have vector registers
may end up with something much longer). As for my example,
auto-inlining a long function that only gets called from one location
is certainly possible if functions are reference-counted, but
something tells me it would require whole-program optimisation,
especially if said function is public (or protected).
I don't trust a compiler to produce the most optimal code, whether it be
Free Pascal, Delphi or a C++ compiler, and if I am greatly concerned
about execution speed, I look at the disassembly (I am aware that most
mainstream programmers don't do this). In most cases, it's inefficient
high-level code that simple refactoring can fix (to use an exaggerated
example, changing a sorting algorithm to use quicksort instead of
bubblesort), while other times there is little you can do from the code
alone. I do try to help the compiler though by giving hints like the
inline directive. To branch into a semi-related issue of the compiler
deciding what's best... when I get pure functions working, one could
argue that the compiler should be smart enough to determine if a
function is pure or not, and hence would have no need for a 'pure'
directive - this is true, but would also be prohibitively slow, since
the compiler would be analysing every node in every function with a
fine-tooth comb. That is partly my concern with auto-inline as well...
though there's an upper limit on the number of nodes before it decides
to not inline it, it still has to analyse the nodes one-by-one, and even
counting them requires iterating through what is effectively a linked list.
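
To make the cost concrete, here is a rough sketch of a node count with an upper limit over a singly linked list. It is not the compiler's actual code: TNode, FitsNodeLimit and the limit values are all made up for the example. The walk stops as soon as the limit is exceeded, but it still has to visit every node up to that point.

```pascal
program NodeCountSketch;

{$mode objfpc}

type
  PNode = ^TNode;
  { Hypothetical stand-in for a node; a real node tree carries far more. }
  TNode = record
    Next: PNode;
  end;

{ Returns True when the list holds at most Limit nodes. Stops walking as
  soon as the limit is exceeded, so the cost is bounded by Limit + 1 steps. }
function FitsNodeLimit(Head: PNode; Limit: Integer): Boolean;
var
  Count: Integer;
begin
  Count := 0;
  while Head <> nil do
  begin
    Inc(Count);
    if Count > Limit then
      Exit(False);
    Head := Head^.Next;
  end;
  Result := True;
end;

var
  Nodes: array[0..4] of TNode;
  I: Integer;
begin
  { Build a five-node list on the stack. }
  for I := 0 to 3 do
    Nodes[I].Next := @Nodes[I + 1];
  Nodes[4].Next := nil;

  WriteLn('Within a limit of 10: ', FitsNodeLimit(@Nodes[0], 10));
  WriteLn('Within a limit of 3:  ', FitsNodeLimit(@Nodes[0], 3));
end.
```
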
I know that one of Turbo Pascal's original selling points was the
speed of its compiler compared to other compilers of the day. I don't
know if compilation speed is in Free Pascal's manifesto, but I like to
think it is, hence my acceptance of compiler hints like 'inline',
whereas auto- features and anything requiring whole program optimisation
(even though that's a Release Build thing) are anything but fast.
Speaking of whole program optimisation, it always seems very fiddly to
set up, to the point that the FPC bootstrapper needs a script to get it
working. Not exactly user-friendly and practically demands learning a
separate skill to get working (at least I've struggled). Shouldn't the
compiler have an option to do the two stages of whole program
optimisation (generate the information files, then use the information
files) in one sitting? It would take a long time to complete, but
advertising it as a Release Build thing should prevent most people from
running it accidentally. I personally like seeing file sizes drop and
execution speed climb in my compiled binaries!
Gareth aka. Kit