[fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM
Simon Kissel
simon.kissel at nerdherrschaft.com
Fri Nov 23 20:44:30 CET 2018
Hi Florian,
> Actually, most of the improvements so far are not related to
> threading. In particular r40339 helped a lot, it was a bug
> fix: the compiler assumed that a certain sub expression was written
> while it was not, and this prevented CSE.
Even better, that means there is still gold to be uncovered :)
In our case the bottleneck very clearly appears to be that
every call to fpc_pushexceptaddr/fpc_popaddrstack causes a
call to CRelocateThreadVar, which causes a call to
pthread_getspecific.
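
To make the shape of that overhead concrete, here is a rough C sketch.
It is not the FPC RTL code: push_frame, pop_frame, get_head and
guarded_work are made-up stand-ins for the push/pop helpers and the
threadvar lookup. It only assumes that the per-thread frame list head
is reached through pthread_getspecific on every push and every pop.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry in a per-thread exception-frame list. */
typedef struct frame { struct frame *next; } frame;

static pthread_key_t frame_key;
static pthread_once_t frame_key_once = PTHREAD_ONCE_INIT;

/* Counts TLS lookups. Plain global: this demo is single-threaded only. */
static unsigned long lookup_count = 0;

static void make_key(void) {
    int rc = pthread_key_create(&frame_key, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Every access to the list head pays for one pthread_getspecific call. */
static frame *get_head(void) {
    lookup_count++;
    return (frame *)pthread_getspecific(frame_key);
}

/* Assumed analogue of pushing an exception frame on routine entry. */
static void push_frame(frame *f) {
    pthread_once(&frame_key_once, make_key);
    f->next = get_head();
    pthread_setspecific(frame_key, f);
}

/* Assumed analogue of popping the frame on routine exit. */
static void pop_frame(void) {
    frame *f = get_head();
    pthread_setspecific(frame_key, f ? f->next : NULL);
}

/* A trivial routine wrapped in an implicit-style frame:
   two TLS lookups for one addition. */
static int guarded_work(int x) {
    frame f;
    push_frame(&f);
    int r = x + 1;
    pop_frame();
    return r;
}
```

Calling guarded_work in a loop costs two lookups per call in this
sketch, which is why small routines that get an implicit frame are hit
hardest.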
We do create our ARM production builds with {$IMPLICITEXCEPTIONS OFF}
to get acceptable speed, else it would be completely unbearable.
BR,
Simon