[fpc-devel] inline... and philosophy
J. Gareth Moreton
gareth at moreton-family.com
Sat Nov 9 02:46:21 CET 2019
That makes a lot of sense. The jump sizes and the indices for local
variables are a big one. On Intel processors, when generating addresses,
if the displacement fits into a signed byte (-128 to 127), then
said displacement only takes a byte to store... outside of that range it
uses 4 bytes. Similarly with jumps.
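
To make the size rule concrete, here is a minimal sketch of it. The function name is made up for illustration, and it only models the two cases described above (it ignores, for instance, addressing forms that need no displacement at all):

```python
def displacement_bytes(disp: int) -> int:
    """Bytes used to store an x86 address displacement, per the rule above.

    Hypothetical helper for illustration only: 1 byte if the value fits
    in a signed byte, otherwise 4 bytes.
    """
    if not -2**31 <= disp <= 2**31 - 1:
        raise ValueError("displacement does not fit in 32 bits")
    return 1 if -128 <= disp <= 127 else 4


# A local variable just past the signed-byte range costs 3 extra bytes
# on every instruction that accesses it.
for offset in (-8, -128, -129, 127, 128):
    print(f"offset {offset:+d}: {displacement_bytes(offset)} byte(s)")
```

So if inlining pushes a frequently used local from offset -128 to -129, every access to it grows by three bytes.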
I'm glad you like the jump optimisations. I just released version 3 of
my patches a few hours ago - hopefully all is okay now. There seems to
be a significant improvement in the compiler speed as a result, which I
honestly didn't expect.
By the way, what is your 'particular CPU'? If it's not Intel-based,
would you be willing to test the patches on other platforms? I'm only
able to run the test suite on a handful of i386 and x86_64 platforms, so
I'm not certain how the optimisations perform elsewhere. Also, I can't guarantee
that my 'condition_in' functions are optimal on non-Intel platforms.
I'm fairly sure they're not incorrect, but they still need testing and
confirmation, and in the case of PowerPC, need expanding since I don't
know how the condition flags work on that architecture.
Gareth aka. Kit
P.S. If something doesn't require philosophy and can be theoretically
calculated, like inlining and outlining, I want to work it out!
(Although some algorithms take far too long to be practical, hence why I
don't plan to implement an 'auto-pure' feature)
On 09/11/2019 01:24, Marģers . via fpc-devel wrote:
> blobing - I meant an unnecessary increase in size, so that the function loses its good shape. There is no such word as "blobing" in English. My bad.
> Let me paraphrase 'just wrong' as 'questionably right'. Currently inlining is left in the hands of programmers, and it is abused as a magical performance booster. For small functions that's most likely true; for larger functions it's questionable.
> 1) it might increase the index size for accessing local variables on the stack.
> 2) it might increase the jump instruction size.
> 3) it changes code location (code crosses page boundaries). On my particular CPU there is a 64-byte code page. If a loop fits inside it, it runs twice as fast as when it overlaps the page boundary by even one byte. Jumping forward is OK (as the expected code flow is always forward). There is also a larger page of a few kB; calling outside it carries a small penalty. As FPC does not manage this in any way, it's pure luck, and it might just get unlucky. Code alignment generally does not solve those things.
> Conclusion: with the naked eye one cannot tell whether an inline is any good or not. To inline or not to inline has nothing to do with philosophy; it has to be calculated (as clang does and FPC doesn't).
> I'm looking forward to the jump optimization being accepted.
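
The third point in the quoted list comes down to whether a block of code straddles an aligned boundary. A minimal sketch of that check follows; the function name is hypothetical, and the 64-byte block size is taken from the quoted message rather than from any particular CPU manual:

```python
def crosses_boundary(start: int, length: int, block: int = 64) -> bool:
    """True if bytes [start, start + length) span more than one aligned block.

    Hypothetical helper for illustration; `block` defaults to the 64-byte
    size mentioned in the quoted message.
    """
    if length <= 0:
        return False
    first_block = start // block
    last_block = (start + length - 1) // block
    return first_block != last_block


# A 20-byte loop fits in one block at address 0x40, but the same loop
# shifted to 0x6D spills into the next block.
print(crosses_boundary(0x40, 20))
print(crosses_boundary(0x6D, 20))
```

Inlining changes the addresses of everything after the call site, so a loop that passed this check before can fail it afterwards without any change to the loop itself.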