[fpc-devel] Peephole optimizer tai class change proposals

J. Gareth Moreton gareth at moreton-family.com
Sun Oct 3 14:04:01 CEST 2021


Hi everyone,

So as my optimisations get more and more sophisticated and intelligent, 
I'm realising that I may need ways to store more information than is 
currently possible.  Obviously I want to avoid enlarging the internal 
state too much or making the code unwieldy, but the additions I have in 
mind are for tai objects:

- A connected tai object
- A flags field or some kind of weighting field

The connected tai object would be useful for jumps that branch to labels 
that were created during peephole optimisation, since these can't be 
looked up with "getlabelwithsym" and so lose out on some subsequent 
optimisations that look across jumps.
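
To illustrate the "connected object" idea, here is a minimal sketch as a toy Python model rather than actual compiler code. The names Label, Jump, connected and find_target are hypothetical and are not part of FPC; the dictionary merely stands in for whatever lookup "getlabelwithsym" performs. The jump carries a direct reference to its label object, so the target can still be found when the symbol lookup fails:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Label:
    name: str


@dataclass
class Jump:
    target_name: str
    # Hypothetical 'connected object' field: a direct reference to the
    # label this jump branches to, set when the optimizer creates the label.
    connected: Optional[Label] = None


def find_target(jump: Jump, symbol_table: Dict[str, Label]) -> Optional[Label]:
    """Prefer the connected label; fall back to the symbol lookup."""
    if jump.connected is not None:
        return jump.connected
    return symbol_table.get(jump.target_name)


# A label that existed before peephole optimisation is in the table...
table = {".Lj2409": Label(".Lj2409")}
# ...but one created during optimisation is not (assumed for this sketch).
new_label = Label(".Lnew1")

plain_jump = Jump(".Lnew1")
linked_jump = Jump(".Lnew1", connected=new_label)
```

In this model, find_target(plain_jump, table) returns None, whereas find_target(linked_jump, table) returns the new label, which is what would let optimisations that look across jumps keep working.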

With regard to the weighting, a flag that indicates "desire to delete" or 
something similar would be very useful in deciding if temporarily making 
the code worse will have a larger payoff later on.  Consider the 
following code snippet (in the Classes unit under -O4 during some of my 
personal experimentation):

     ...
     movq    %rdx,%rsi
     movq    %r8,%rdi
     testq    %r8,%r8
     je    .Lj2408
     call    CLASSES$_$TFPLIST_$__$$_COPYMOVE$TFPLIST
     movq    %rdi,%rdx
     jmp    .Lj2409
     .p2align 4,,10
     .p2align 3
.Lj2408:
     movq    %rsi,%rdx
.Lj2409:
     movq    %rbx,%rcx
     call    CLASSES$_$TFPLIST_$__$$_MERGEMOVE$TFPLIST
     ...

There's an option to insert "cmoveq %rsi,%rdx", mirroring the "movq 
%rsi,%rdx" instruction (between the two labels), before "je .Lj2408", 
and change that conditional branch to "je .Lj2409".  By itself, this 
wouldn't be a good optimisation because CMOV is slightly slower than MOV 
and it would break the macrofusion between the TEST and JE instructions, 
as well as increasing code size and overall instruction count.  However, 
it would cluster the two labels and .Lj2408 only has a single reference; 
if it and the MOV instruction were removed, the entire block between 
"jmp .Lj2409" and its label would collapse, erasing the original MOV, 
and then a subsequent pass will notice that "cmoveq %rsi,%rdx" is 
unnecessary because %rdx = %rsi already at that point in the program 
flow, thus restoring the macrofusion.  At this point, the optimal code 
would be:

     ...
     movq    %rdx,%rsi
     movq    %r8,%rdi
     testq    %r8,%r8
     je    .Lj2409
     call    CLASSES$_$TFPLIST_$__$$_COPYMOVE$TFPLIST
     movq    %rdi,%rdx
     .p2align 4,,10
     .p2align 3
.Lj2409:
     movq    %rbx,%rcx
     call    CLASSES$_$TFPLIST_$__$$_MERGEMOVE$TFPLIST
     ...

Given that .Lj2409 only has a single reference from the nearby 
conditional branch, it may be beneficial to remove the alignment hints 
at this point too, although that one is more up in the air. 
Nevertheless, if the peephole optimizer, while scanning the conditional 
jump, sees that the destination label only has a single reference and 
has been flagged 'desire to delete', it will be more likely to perform 
the optimisation.  However, if the flag is clear, it won't optimise.  As 
for when to set that flag, it will probably be when the peephole 
optimizer scans the original "jmp .Lj2409" instruction and sees how 
close it is to being a zero distance jump, with only alignment fields, 
another label and a single MOV before it finds .Lj2409.
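
The flagging heuristic described above could look roughly like the following toy Python model. It is only a sketch under stated assumptions: Item, want_delete, flag_near_zero_jump and should_insert_cmov are hypothetical names, the instruction stream is a simple list rather than a tai list, and the real check would need to confirm far more (register usage, the condition of the branch, and so on).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    kind: str                  # "jmp", "jcc", "label", "align", "mov", "other"
    arg: str = ""              # label name for jumps and labels
    want_delete: bool = False  # hypothetical 'desire to delete' flag


def count_refs(code: List[Item], name: str) -> int:
    """Number of jumps in the stream that target the given label."""
    return sum(1 for it in code if it.kind in ("jmp", "jcc") and it.arg == name)


def flag_near_zero_jump(code: List[Item], jmp_index: int) -> bool:
    """Flag the label between a JMP and its target if only alignment,
    that one label and a single MOV separate the two."""
    target = code[jmp_index].arg
    inner = None
    movs = 0
    for it in code[jmp_index + 1:]:
        if it.kind == "label" and it.arg == target:
            if inner is not None and movs == 1:
                inner.want_delete = True
                return True
            return False
        if it.kind == "align":
            continue
        if it.kind == "label" and inner is None:
            inner = it
        elif it.kind == "mov" and inner is not None and movs == 0:
            movs = 1
        else:
            return False  # anything else means the jump is not near-zero
    return False


def should_insert_cmov(code: List[Item], jcc_index: int) -> bool:
    """Only optimise if the destination is flagged and singly referenced."""
    name = code[jcc_index].arg
    label = next((it for it in code if it.kind == "label" and it.arg == name), None)
    return label is not None and label.want_delete and count_refs(code, name) == 1


# The snippet from the Classes unit, reduced to item kinds.
code = [
    Item("mov"), Item("mov"), Item("other"),   # movq, movq, testq
    Item("jcc", ".Lj2408"),
    Item("other"), Item("mov"),                # call, movq %rdi,%rdx
    Item("jmp", ".Lj2409"),
    Item("align"), Item("align"),
    Item("label", ".Lj2408"), Item("mov"),
    Item("label", ".Lj2409"),
    Item("mov"), Item("other"),
]
before = should_insert_cmov(code, 3)
flagged = flag_near_zero_jump(code, 6)
after = should_insert_cmov(code, 3)
```

Here the conditional jump is rejected while the flag is clear and accepted once the scan of "jmp .Lj2409" has flagged .Lj2408, matching the two-step behaviour described above.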

I'm aware that the tai class declares an "optinfo" field, although I'm 
uncertain if this is safe to use or not given it's wrapped by a 
conditional define and is of type Pointer.  I could use optinfo to store 
a reference to the tai_label object in a jump instruction - would this 
be an acceptable use of it?  I'm not sure about using it for a 'desire 
to delete' flag though, especially where there might be situations where 
other tai objects could carry it... maybe even a jump instruction.

This is more of a running commentary of my design thoughts, so I 
apologise for the info dump, but to summarise, could I add a 'connected 
object' and 'flags' field to the tai class, and/or can I use the optinfo 
field for a similar purpose?

Remember I can always refactor later if a feature could be cleaned up or 
restructured in some way.

Gareth aka. Kit




