[fpc-devel] Prototype optimisation... Sliding Window

J. Gareth Moreton gareth at moreton-family.com
Thu Feb 17 20:25:40 CET 2022

Hi everyone,

So I've started experimenting with a new technique in the peephole 
optimizer for x86 platforms that I've named the Sliding Window.  The 
intention is to use it to help replace common blocks of code within a 
procedure, such as pointer dereferences.  So far I'm having a degree of 
success, although it's pre-alpha currently.

In the aasmcnst disassembly - before:

     movq    %rax,%rdx
     movq    U_$AASMDATA_$$_CURRENT_ASMDATA(%rip),%rax
     movq    224(%rax),%rcx
     movq    U_$AASMDATA_$$_CURRENT_ASMDATA(%rip),%rax
     movq    224(%rax),%rax
     movq    (%rax),%rax
     call    *232(%rax)

After:

     movq    %rax,%rdx
     movq    U_$AASMDATA_$$_CURRENT_ASMDATA(%rip),%rax
     movq    224(%rax),%rcx
     movq    (%rcx),%rax
     call    *232(%rax)

The peephole optimizer replaces the second "movq
U_$AASMDATA_$$_CURRENT_ASMDATA(%rip),%rax" and the "movq 224(%rax),%rax"
instructions with "movq %rcx,%rax", which is in turn removed by a
conventional peephole optimisation because %rax is completely
overwritten by the next instruction ("movq (%rcx),%rax").  I sense I
might be onto a winner here!
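
For anyone who wants to play with the idea outside the compiler, here is a rough Python sketch of the rewrite described above. It is not the Sliding Window code and not FPC code: it is a much simpler symbolic value-tracking pass that happens to produce the same "movq %rcx,%rax" on this sequence. The tuple layout, the name reuse_loads and the "anything that is not a plain movq into a register forgets everything" rule are all assumptions made for the illustration.

```python
import re

SYM = "U_$AASMDATA_$$_CURRENT_ASMDATA(%rip)"

# The "before" listing as (op, src, dst) tuples in AT&T operand order.
BEFORE = [
    ("movq", "%rax", "%rdx"),
    ("movq", SYM, "%rax"),
    ("movq", "224(%rax)", "%rcx"),
    ("movq", SYM, "%rax"),
    ("movq", "224(%rax)", "%rax"),
    ("movq", "(%rax)", "%rax"),
    ("call", "*232(%rax)", None),
]


def reuse_loads(instrs):
    """Toy pass: turn a repeated memory load into a register copy.

    Hypothetical helper for illustration only. It understands just the
    operand forms used in BEFORE.
    """
    held = {}    # register -> symbolic value it currently holds
    epoch = 0    # bumped at every barrier so old names cannot alias new ones
    out = []

    def value_of(reg):
        # Registers we know nothing about get a name tied to the epoch.
        return held.get(reg, f"{reg}@{epoch}")

    for op, src, dst in instrs:
        # Assumption: calls, stores and unknown instructions may change
        # memory, so treat them as barriers and forget all tracked values.
        if op != "movq" or dst is None or not dst.startswith("%"):
            held.clear()
            epoch += 1
            out.append((op, src, dst))
            continue

        if src.startswith("%"):
            val = value_of(src)                      # register copy
        else:
            # Memory load: name it by the values of the address registers.
            addr = re.sub(r"%\w+", lambda m: value_of(m.group(0)), src)
            val = f"load[{addr}]"
            holder = next((r for r, v in held.items() if v == val), None)
            if holder is not None:
                src = holder                         # reuse the earlier load

        held[dst] = val
        if src != dst:                               # drop no-op self moves
            out.append((op, src, dst))
    return out


AFTER_SKETCH = reuse_loads(BEFORE)
for op, src, dst in AFTER_SKETCH:
    print(f"     {op}    {src}" + (f",{dst}" if dst else ""))
```

The sketch stops at the point where "movq %rcx,%rax" has been introduced; folding it into the following load is left to the conventional peephole optimisations. Note that the barrier rule is deliberately pessimistic, since the sketch has no notion of which instructions are actually safe to look past.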

I do have a question though... are there any tai objects that signal a
memory fence or a synchronisation hint, so that the optimiser knows not
to merge a repeated reference? I know instructions like MFENCE can kind
of force it.

Gareth aka. Kit

P.S. The term "sliding window" comes from the LZ77 compression algorithm 
and is used to track repeated sequences in ZIP files, among others.  
This prototype optimisation essentially uses the same construct, but 
built for instructions instead of individual bytes.
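
To make the LZ77 analogy a bit more concrete, here is a minimal Python sketch of a sliding-window longest-match search that works on instructions rather than bytes. It is only an illustration under some loud assumptions: the function names, the window size and the choice to compare instructions by (opcode, source) while ignoring the destination are all made up for this example, and a match found this way is only a candidate. A real pass would still have to prove that the registers and memory involved are unchanged before rewriting anything.

```python
SYM = "U_$AASMDATA_$$_CURRENT_ASMDATA(%rip)"

# The "before" listing as (op, src, dst) tuples in AT&T operand order.
BEFORE = [
    ("movq", "%rax", "%rdx"),
    ("movq", SYM, "%rax"),
    ("movq", "224(%rax)", "%rcx"),
    ("movq", SYM, "%rax"),
    ("movq", "224(%rax)", "%rax"),
    ("movq", "(%rax)", "%rax"),
    ("call", "*232(%rax)", None),
]


def longest_match(items, pos, window):
    """Return (distance, length) of the longest earlier copy of items[pos:].

    Only the last `window` items before `pos` are searched, as in LZ77.
    Unlike byte-level LZ77, the match may not overlap `pos` itself.
    """
    best = (0, 0)
    for cand in range(max(0, pos - window), pos):
        length = 0
        while (pos + length < len(items)
               and cand + length < pos
               and items[cand + length] == items[pos + length]):
            length += 1
        if length > best[1]:
            best = (pos - cand, length)
    return best


def find_repeats(items, window=8, min_len=2):
    """Scan left to right and collect (position, distance, length) matches."""
    pos, found = 0, []
    while pos < len(items):
        dist, length = longest_match(items, pos, window)
        if length >= min_len:
            found.append((pos, dist, length))
            pos += length          # skip past the repeated run
        else:
            pos += 1
    return found


# Assumption: compare by opcode and source only, so that the two
# "224(%rax)" loads count as equal despite different destinations.
KEYS = [(op, src) for op, src, dst in BEFORE]
REPEATS = find_repeats(KEYS)
print(REPEATS)
```

On the aasmcnst sequence this reports a single repeat: the two instructions starting at index 3 (counting from 0) copy the two instructions that begin two positions earlier.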
