[fpc-devel] Kit's ambitions!

J. Gareth Moreton gareth at moreton-family.com
Mon Jun 4 01:30:17 CEST 2018


So far, I'm researching the optimisation listed below: tracking registers
that hold identical values and substituting one for another to minimise
pipeline stalls.  Because I don't need to keep track of their actual values,
just whether they've changed since a particular MOV instruction, I've
managed to move this into the peephole optimiser as an extension to
TX86AsmOptimizer.PostPeepholeOptMov().
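
To illustrate the idea, here is a rough sketch in C.  It is not the actual
FPC code: the Insn type, the OpKind values and forward_copies() are all
hypothetical names made up for this example.  The tracker only remembers
which register is still a copy of which.  Any write to either register
drops that assumption, and an instruction the pass doesn't understand drops
everything:

```c
enum { RAX, RBX, RCX, RDX, NUM_REGS };
enum { NO_REG = -1 };

typedef enum { OP_MOV_REG, OP_STORE, OP_WRITE, OP_UNKNOWN } OpKind;

typedef struct {
    OpKind kind;
    int src;  /* register read (OP_MOV_REG, OP_STORE), else NO_REG  */
    int dst;  /* register written (OP_MOV_REG, OP_WRITE), else NO_REG */
} Insn;

/* Drop every assumption: no register is known to be a copy of another. */
void forget_all(int copy_of[NUM_REGS])
{
    for (int r = 0; r < NUM_REGS; r++)
        copy_of[r] = NO_REG;
}

/* A write to r invalidates r's own alias and every alias that names r. */
void invalidate(int copy_of[NUM_REGS], int r)
{
    copy_of[r] = NO_REG;
    for (int i = 0; i < NUM_REGS; i++)
        if (copy_of[i] == r)
            copy_of[i] = NO_REG;
}

/* Rewrite reads of a copied register to read the original register.
   copy_of[r] == s means "r has not changed since MOV s -> r, nor has s".
   Returns the number of operands that were rewritten. */
int forward_copies(Insn *code, int n)
{
    int copy_of[NUM_REGS];
    int rewrites = 0;

    forget_all(copy_of);
    for (int i = 0; i < n; i++) {
        Insn *in = &code[i];

        if (in->kind == OP_UNKNOWN) {   /* not understood: play safe */
            forget_all(copy_of);
            continue;
        }
        /* Handle the read before the write of the same instruction. */
        if (in->src != NO_REG && copy_of[in->src] != NO_REG) {
            in->src = copy_of[in->src];
            rewrites++;
        }
        if (in->dst != NO_REG) {
            invalidate(copy_of, in->dst);
            if (in->kind == OP_MOV_REG && in->src != in->dst)
                copy_of[in->dst] = in->src;
        }
    }
    return rewrites;
}
```

For a sequence such as "copy RAX to RBX; write RCX; store RBX", the store
is rewritten to read RAX, so it no longer depends on the result of the MOV.
If RAX is overwritten in between, or an OP_UNKNOWN instruction appears, the
store is left alone.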

It's a bit more difficult than it looks, though.  I've had a lot of crashes
so far when it changes a register that it shouldn't, but I'm ironing out
the bugs one by one.  To truly see the gains, one would need to perform some
kind of intensive timing comparison.

This would be the first step in the step-by-step implementation.  Deeper
data-flow optimisation, like merging div and mod instructions that share
the same numerator and denominator, will require more care and thought,
especially as the two division operations may not use the same registers.
If successful, though, it will improve the compiler itself, since it has
"x div 1000" and "x mod 1000" side-by-side in a couple of places, a common
pair of expressions for producing a human-readable time metric, e.g. seconds
and milliseconds.
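
As a source-level analogy of the div/mod merge, here is a small sketch in C
rather than Pascal.  TimeParts, split_naive() and split_merged() are names
invented for this example.  The standard ldiv() function returns quotient
and remainder from a single call, much as the x86 DIV and IDIV instructions
leave the quotient in RAX and the remainder in RDX for 64-bit operands:

```c
#include <stdlib.h>

typedef struct {
    long seconds;
    long millis;
} TimeParts;

/* Two separate expressions, like "x div 1000" and "x mod 1000". */
TimeParts split_naive(long total_ms)
{
    TimeParts t;
    t.seconds = total_ms / 1000;
    t.millis  = total_ms % 1000;
    return t;
}

/* Merged form: one division supplies both results. */
TimeParts split_merged(long total_ms)
{
    ldiv_t qr = ldiv(total_ms, 1000L);
    TimeParts t = { qr.quot, qr.rem };
    return t;
}
```

Both functions give the same answer (61234 ms splits into 61 s and 234 ms);
the point is only that the second form asks for the division once.  Doing
the same merge at the assembly level is harder for the reason given above:
the two divisions may have been assigned different registers.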

Gareth aka. Kit

 On Sun 03/06/18 14:12, Florian Klämpfl (florian at freepascal.org) sent:
 On 21.05.2018 at 21:05, J. Gareth Moreton wrote:
 > Would you object to me trying anyway, Florian? 

 No, feel free to go ahead, but it needs to be done step by step. 

 > It might be that I run into the same problems you had and it's too
 > unsafe, but I'm going by a conservative philosophy in that if it spots
 > something that it can't work out (e.g. an instruction that it's not
 > programmed to handle) or is potentially unsafe (e.g. reading and writing
 > to a block of memory that it doesn't have control over, due to
 > multi-threading issues), then it just stops optimising and drops all
 > assumptions that it has made at that point.
 >
 > As a small test case, I'm attempting to see if I can spot and optimise,
 > for example, "mov %rax, %rbx; lea %rcx, -8(%rsp); mov %rbx, 8(%rsp)",
 > where a pipeline stall occurs due to a read-after-write penalty (with
 > %rbx in this case).

 Things like this are fine, it gets hairy though as soon as memory
locations are involved. 