[fpc-devel] FillWord, FillDWord and FillQWord are very poorly optimised on Win64 (not sure about x86-64 on Linux)
Florian Klämpfl
florian at freepascal.org
Wed Nov 1 11:51:21 CET 2017
On 01.11.2017 at 05:58, J. Gareth Moreton wrote:
> I also made versions that use memory fences and other checks, such as memory alignment, in order to gain speed.
> I've converted them to use the System V ABI of Linux as well, but those are currently completely untested, as I
> don't yet have the facilities to compile on Linux. (They are also even smaller in code size, because you don't
> need to push and pop RDI, and the destination (var x) is already stored in RDI, thereby collapsing each
> routine to just 3 instructions, not including the REP prefix.)
>
> Would it be worth opening up a bug report for this, with the attached assembler routines as suggestions?
Yes, for sure.
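
For readers following along, here is a rough sketch of the System V idea described in the quote above. It is not the attached code, which is not reproduced in this message, and it is written in C with GCC/Clang inline assembly only so that it can be compiled and run easily. The name fill_dword, the early return for non-positive counts, and the portable fallback loop are all assumptions made for illustration. The point it shows is that once the destination is in RDI, the count in RCX and the value in EAX, a single REP STOSD does the whole fill.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a FillDWord-style routine:
   store `value` into `count` consecutive 32-bit slots starting at `dest`. */
void fill_dword(uint32_t *dest, ptrdiff_t count, uint32_t value)
{
    if (count <= 0)
        return;  /* assumption: a non-positive count is treated as a no-op */

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    /* The constraints place dest in RDI, count in RCX and value in EAX;
       "rep stosl" is the AT&T spelling of REP STOSD. The direction flag
       is assumed clear, as the ABI requires at function boundaries. */
    __asm__ volatile ("rep stosl"
                      : "+D"(dest), "+c"(count)
                      : "a"(value)
                      : "memory");
#else
    /* Portable fallback for other targets or compilers. */
    for (ptrdiff_t i = 0; i < count; i++)
        dest[i] = value;
#endif
}
```

On Win64 the destination arrives in RCX rather than RDI, and RDI is callee-saved, which is where the extra push/pop mentioned in the quote comes from.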
> I haven't worked out how to implement internal functions into the compiler yet,
Fill* are not internal functions, so you just have to adapt the system unit.
> and I'd rather clear it with you
> guys first before I make such an addition. I had a thought that the simple routines above could be used
> when compiling for small code size, while larger, more advanced ones are used when compiling for speed.
I would provide only one version. After all, Fill* is only a very small part of the RTL, so shaving
off a few bytes here does not matter, and we are not in a 1k contest :)