[fpc-devel] Double-checking an optimisation

Florian Klämpfl florian at freepascal.org
Sun Jan 9 13:35:15 CET 2022


On 09.01.2022 at 01:37, J. Gareth Moreton via fpc-devel wrote:
> Hi everyone,
> 
> So a merge request of mine was just approved that allows the peephole optimizer access to more registers when it needs 
> one for temporary storage.  It allows it to make an optimisation on x86_64-win64 that wasn't possible before due to the 
> lack of available volatile registers. In packages\numlib\src\dsl.pas - before:
> 
> .Lj184:
>      ...
>      cmpl    $1,%ecx
>      jng    .Lj188
>      subl    $1,%ecx
> .Lj188:
>      ...
> 
> After:
> 
> .Lj184:
>      ...
>      cmpl    $1,%ecx
>      setg    %bl
>      movzbl    %bl,%ebx
>      subl    %ebx,%ecx
>      ...
> 
> %ebx is a non-volatile register, but the current subroutine preserves it and it's not currently in use, so the peephole 
> optimizer can borrow it for a few instructions.
> 
> I need to double-check though... is this actually a good optimisation for speed?

I think getting rid of jumps is always good, as it also reduces branch predictor load. Not to mention that, iirc, the 
branch predictor of most CPUs can correctly handle only one jump per 16 bytes.

>  It removes a jump and a label, which 
> might permit other long-range optimisations, but it's 3 instructions that are in a dependency chain.

Didn't you implement something which transformed the code above into

       xorl    %ebx,%ebx
       cmpl    $1,%ecx
       setg    %bl
       subl    %ebx,%ecx

?
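
For reference, the transformation under discussion can be sketched in C. This is only an illustration of the equivalence, not code from FPC or from numlib: the function names are hypothetical, and a signed 32-bit integer stands in for %ecx. The branchless form relies on a comparison yielding 0 or 1, which is what setg plus a zero-extended register provides. In the xorl variant the register is cleared before the cmpl (xorl changes the flags, so it cannot come after), which takes the zero-extension out of the dependency chain.

```c
#include <assert.h>
#include <stdint.h>

/* Branching form, mirroring: cmpl $1,%ecx / jng .Lj188 / subl $1,%ecx */
int32_t dec_if_gt1_branch(int32_t x)
{
    if (x > 1)
        x -= 1;
    return x;
}

/* Branchless form, mirroring: cmpl $1,%ecx / setg %bl / subl %ebx,%ecx
   (x > 1) evaluates to 0 or 1, like setg into a zeroed register. */
int32_t dec_if_gt1_branchless(int32_t x)
{
    int32_t flag = (x > 1);
    return x - flag;
}

/* Hypothetical self-check: both forms agree over a small range
   and at the extremes of the 32-bit signed range. */
void check_equivalence(void)
{
    for (int32_t x = -1000; x <= 1000; x++)
        assert(dec_if_gt1_branch(x) == dec_if_gt1_branchless(x));
    assert(dec_if_gt1_branch(INT32_MIN) == dec_if_gt1_branchless(INT32_MIN));
    assert(dec_if_gt1_branch(INT32_MAX) == dec_if_gt1_branchless(INT32_MAX));
}
```

Whether the branchless form is actually faster depends on how predictable the branch is in the real code, so this sketch only shows that the two forms compute the same result.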

