[fpc-devel] Inlining problem and LEB128
J. Gareth Moreton
gareth at moreton-family.com
Sun Jun 23 01:10:52 CEST 2019
This one is for Jonas in particular, but also to address a minor issue
in general.
Currently, you can get away with specifying the "inline" directive for a
function in just the implementation section of a unit, and the function
will be inlined. However, in some situations, this can cause compiler
crashes, which I believe are due to a mixture of the function no longer
being inlined and the internal CRCs not changing, possibly due to a
collision. While collisions are ultimately unavoidable (for 32-bit
hashes like CRC32, a collision is expected after only about 82,137
random values... look up "Birthday attack" for more information), their
effects can be mitigated with careful design.
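
For reference, the 82,137 figure matches the expected number of uniformly random 32-bit values drawn before the first collision, approximately sqrt(pi/2 * 2^32). The sketch below reproduces it under the assumption that the hash behaves like a uniform random function, which CRC32 only approximates. The function names are illustrative and not part of the compiler.

```python
import math


def expected_draws_until_collision(bits):
    """Approximate expected number of random values before the first
    collision in a hash space of 2**bits values (birthday bound)."""
    n = 2 ** bits
    return math.sqrt(math.pi * n / 2)


def collision_probability(k, bits):
    """Approximate probability that k random values contain at least one
    collision, using the standard 1 - exp(-k(k-1)/(2n)) estimate."""
    n = 2 ** bits
    return 1.0 - math.exp(-k * (k - 1) / (2 * n))


if __name__ == "__main__":
    print(round(expected_draws_until_collision(32)))
    print(collision_probability(77163, 32))
```

Under the same assumption, the probability of at least one collision reaches roughly one half at about 77,000 values.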
Internally, inlined functions store a copy of their node tree in the
compiled PPU file, and the pointer to this tree is nil if the function
is not inlined, or was not able to be inlined (e.g. because of a
recursive call). The pointer is not always checked, though; some code
relies on a flag instead, and hence it's possible to trigger an
EAccessViolation exception. I may not have the details quite correct
though - maybe Jonas can clarify.
The most robust solution, and the one originally proposed, is to demand
that "inline" appears in the interface section, but this breaks a number
of packages and RTL units and also hurts some cross-platform
optimisation, where functions like SwapBuffers can be efficiently
inlined on some platforms but not on others (by specifying "inline" in
the appropriate implementation).
My solution to this is as follows... ALWAYS attempt to create a node
tree (unless "noinline" is specified). This is effectively
auto-inlining, but routines are not inlined unless the explicit
directive instructs the compiler to do so (or something like -O4 is
specified and the compiler determines that a function should be inlined
due to its shortness, say). This would help mitigate the problem of the
nil node tree by causing the compiler to check said node tree and set
the "cannot inline" flag when appropriate.
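
To make the proposal concrete, here is a rough model of the idea in Python rather than in the compiler's own Pascal. Every name in it (ProcInfo, prepare_inline_tree, should_inline and the fields) is hypothetical and does not correspond to actual compiler internals. It only illustrates the intended invariant: the tree is always attempted, the flag is set whenever the tree is missing, and callers consult both.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcInfo:
    """Hypothetical stand-in for a procedure definition."""
    name: str
    body: list                       # stand-in for the real node tree
    has_inline_directive: bool = False
    has_noinline_directive: bool = False
    calls_itself: bool = False       # e.g. a recursive call
    inline_tree: Optional[list] = None
    cannot_inline: bool = False


def prepare_inline_tree(proc):
    """Always try to build the tree; keep the flag in step with it."""
    if proc.has_noinline_directive or proc.calls_itself:
        proc.inline_tree = None
        proc.cannot_inline = True
        return
    proc.inline_tree = list(proc.body)   # copy of the node tree
    proc.cannot_inline = False


def should_inline(proc, auto_inline=False):
    """Inline only on request, and never when the tree is missing."""
    if proc.cannot_inline or proc.inline_tree is None:
        return False
    return proc.has_inline_directive or auto_inline
```

In this model, a routine without the directive still carries a tree but is only inlined when auto_inline is passed, which stands in for something like -O4 deciding that the routine is short enough.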
This isn't ideal though, as it would have two notable side-effects:
- General compilation will be slightly slower because of extra analysis
on the node trees and copying them into the "inlined nodes" pointer.
- The compiled PPU files will increase in size significantly.
Some care with the design might be able to mitigate the slower
compilation, such as not duplicating objects unless absolutely necessary
(for example, only when actually inlining a function into a caller, which
will only occur if "inline" is specified or auto-inlining is enabled).
The second issue is a bit trickier, and this is where the second subject
comes into play.
You may remember that a few months ago, I attempted to shrink the size
of PPU files by storing particular fields in LEB128 format, but this was
suspended because it increased quick compilation time, despite it
reducing the size of PPU files by over 10% (without using explicit
compression algorithms). However, preliminary tests on large projects
(we used Lazarus as a test case) showed that the quick compilation time
was only about 3 seconds as compared to 2 seconds, while full compilation times
were not adversely affected because the processing required to encode
LEB128 values is vastly overshadowed by the compilation process in
general. Given it's unlikely that a user will quick-compile a project
more than once a minute while coding, I would like to revisit this
encoding proposal as a possible solution to the excessive PPU sizes that
will result. While storage is not the scarce commodity on desktop
computers that it once was in the 80s and 90s, it is still at a premium
on some embedded platforms.
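
For readers unfamiliar with LEB128, the following is a minimal sketch of the unsigned variant in Python. It is not the PPU reader/writer code, and the function names are illustrative. Each output byte carries 7 bits of the value, with the high bit set on every byte except the last.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte, high bit clear
            return bytes(out)


def decode_uleb128(data, offset=0):
    """Decode one unsigned LEB128 value starting at offset.

    Returns (value, offset of the next unread byte)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, offset
        shift += 7


if __name__ == "__main__":
    for n in (0, 127, 128, 624485):
        print(n, encode_uleb128(n).hex())
```

Values below 128 take a single byte instead of the four a fixed 32-bit field would use, which is where the size reduction comes from when most stored values are small. The price is the per-byte loop on every read and write.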
Gareth aka. Kit
P.S. If I've got my ideas of the problem completely incorrect, I humbly
apologise, but hopefully I will come to properly understand the issue
and a solution can be found.
P.P.S. Personally I actually like "inline" being in the implementation
section because it's adjacent to the code that will actually be inlined
and hence can be added or removed at the programmer's discretion based
on what the function is doing, and as mentioned above, has a use in
cross-platform optimisation.