[fpc-devel] inline... and philosophy
J. Gareth Moreton
gareth at moreton-family.com
Fri Nov 8 17:45:16 CET 2019
Thanks for the explanation. I still have a lot to learn with some
things. I guess when you compare yourself to the behemoths, you're
always going to look sloppy.
Things that come to mind that could be possible when I think about
whole-program optimisation and smart linking:
- Identification of duplicate functions. This is not always as obvious
as code duplication, but when I've compiled the compiler before, I've
noticed that a number of leaf functions compile into the same machine
code, which may be due to types that are identical on a particular
platform, or the elimination of code due to preprocessor directives, for
example. I imagine such identification could be done via a hash table,
and then doing a more thorough check to see if there is an actual match
or an unfortunate collision. Admittedly a number of these routines are
inlined so it might not produce much of a saving in the real world.
- Identifying functions that are only used once. This became a slight
point of contention between Florian and myself, because I inlined a
couple of functions in my jump optimisations that I was absolutely
certain were only called once elsewhere. When a function is only called
once, theoretically there's a slight speed and size saving if the
function is inlined at the call site. I figure it would require
whole-program optimisation though, because by the time the linker sees
the code the call opcodes have already been generated, while inserting
the raw nodes would yield more optimal code (better register usage and
cancelling out the actual parameter set-up). Theorising an
implementation, calls that are 'noinline' or have
something that the compiler flags as 'cannot inline' would not be
optimised in this way, and assembler routines are intrinsically
'noinline' as well, so it covers that use case.
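
As a rough sketch of both ideas (in Python rather than Pascal, purely
for illustration; the function names, the input layout and the example
symbols are all hypothetical and have nothing to do with FPC's actual
internals):

```python
import zlib
from collections import Counter, defaultdict


def find_duplicate_functions(code_by_name):
    """Group functions whose machine code is byte-for-byte identical.

    code_by_name: hypothetical map of symbol name -> compiled bytes.
    """
    # Pass 1: bucket by a cheap hash. Different code can share a bucket.
    buckets = defaultdict(list)
    for name, code in code_by_name.items():
        buckets[zlib.crc32(code)].append(name)

    # Pass 2: full comparison inside each bucket, so an unfortunate
    # collision is never mistaken for a real match.
    groups = []
    for names in buckets.values():
        matched = []  # each entry lists names sharing identical bytes
        for name in names:
            for group in matched:
                if code_by_name[group[0]] == code_by_name[name]:
                    group.append(name)
                    break
            else:
                matched.append([name])
        groups.extend(sorted(g) for g in matched if len(g) > 1)
    return sorted(groups)


def single_call_inline_candidates(calls, noinline):
    """Return functions with exactly one call site in the whole program.

    calls: hypothetical map of caller -> callee names, one per call site.
    noinline: names that must never be inlined (e.g. assembler routines).
    Simplification: functions whose address is taken are not tracked.
    """
    counts = Counter(c for callees in calls.values() for c in callees)
    return sorted(f for f, n in counts.items()
                  if n == 1 and f not in noinline)


if __name__ == "__main__":
    code = {"a": b"\x31\xc0\xc3", "b": b"\x31\xc0\xc3", "c": b"\xc3"}
    print(find_duplicate_functions(code))
    calls = {"main": ["f", "g", "g"], "f": ["h"], "g": ["asm_helper"]}
    print(single_call_inline_candidates(calls, {"asm_helper"}))
```

A real implementation would have to work on relocatable code rather
than raw bytes, and would need the complete call graph, which is why I
think it belongs with whole-program optimisation.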
Thanks again for the education on the things I still don't know!
Gareth aka. Kit
On 08/11/2019 16:15, Sven Barth via fpc-devel wrote:
> J. Gareth Moreton <gareth at moreton-family.com> wrote on Fri, 8 Nov
> 2019, 14:28:
> > On 08/11/2019 13:14, Sven Barth via fpc-devel wrote:
> > > ...
> > > What's stopping that? Simple: no driving need. It's just work for
> > > something that has essentially no gain.
> >
> > No gain? Wow, is whole-program optimisation that underperforming?
> > Given the bloated size of FPC's binaries compared to, say, what a
> > C++ compiler can do, I would have thought that there could be a lot
> > that could be stripped out in regards to unused functions and the
> > like.
>
> Unused functions are handled by smart linking. No need for WPO here.
> WPO is needed for devirtualisation, for example, for which the
> compiler is a very good use case due to the architecture of the
> backend. For other real-world applications your mileage may vary.
>
> One possible further WPO task would be deduplication of generic
> specializations for the same types (at least unless the target also
> supports comdat sections).
>
> But all in all WPO isn't used that much in the real world.
>
> > Or am I missing something? The large binary sizes feel like an
> > elephant in the room that no-one talks about. What causes them?
>
> Mainly RTTI and the fact that FPC provides a statically linked RTL.
> Change MSVC to static linking and suddenly you get 300 KB executables
> as well.
> Back when I did the first tests with dynamic packages the chmcmd
> binary only had 20 KB or so, but the necessary package libraries were
> much bigger (and there smart linking and WPO are both much less usable
> as they can only strip stuff that is not exported).