[fpc-devel] FillWord, FillDWord and FillQWord are very poorly optimised on Win64 (not sure about x86-64 on Linux)

Florian Klämpfl florian at freepascal.org
Wed Nov 1 11:51:21 CET 2017

On 01.11.2017 at 05:58, J. Gareth Moreton wrote:

> I also made versions that use memory fences and other checks such as memory alignment in order to gain speed
> - I've converted them to use the System V ABI of Linux as well, but they are currently completely untested as I
> don't yet have the facilities to compile on Linux (they are also even smaller in code size because you don't
> need to push and pop RDI, and the destination (var x) is already stored in RDI, thereby collapsing each
> routine to just 3 instructions (not including the REP prefix)).
> Would it be worth opening up a bug report for this, with the attached assembler routines as suggestions?

Yes, for sure.
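
For readers without the attachment, the idea behind the short System V variant in the quoted message can be sketched roughly as follows. This is an illustration in C with GCC/Clang inline assembly, not the attached routines and not the RTL implementation; the name fill_word_sketch is hypothetical. On x86-64 the destination pointer goes into RDI, the count into RCX and the value into AX, after which a single REP STOSW does the fill. A plain loop is used as a fallback on other targets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only (not the FPC RTL routine):
   store `count` copies of the 16-bit `value` starting at `dst`. */
static void fill_word_sketch(uint16_t *dst, size_t count, uint16_t value)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    /* REP STOSW stores AX to [RDI] and advances RDI by 2, RCX times.
       The constraints place dst in RDI, count in RCX and value in AX.
       The ABI requires the direction flag to be clear on function
       entry, so the fill runs forward. */
    __asm__ volatile("rep stosw"
                     : "+D"(dst), "+c"(count)
                     : "a"(value)
                     : "memory");
#else
    /* Portable fallback with the same observable behaviour. */
    for (size_t i = 0; i < count; i++)
        dst[i] = value;
#endif
}
```

A count of zero leaves the buffer untouched in both branches, since REP with RCX = 0 executes no iterations.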

> I haven't worked out how to implement internal functions into the compiler yet,

Fill* are not internal functions, so you just have to adapt the system unit.

> and I'd rather clear it with you
> guys first before I make such an addition.  I had a thought that the simple routines above could be used
> when compiling for small code size, while larger, more advanced ones are used when compiling for speed.

I would provide only one version. After all, Fill* is only a very small part of the RTL, so shaving
off a few bytes here does not matter, and we are not in a 1k contest :)
