[fpc-devel] x86_64 Optimizer Overhaul

Marģers . margers.roked at inbox.lv
Wed Dec 12 18:50:45 CET 2018


 

----- Reply to message -----
Subject: Re: [fpc-devel] x86_64 Optimizer Overhaul
Date: 12 December 2018 17:02:02
From:  J. Gareth Moreton <gareth at moreton-family.com>
To:  FPC developers' list
<fpc-devel at lists.freepascal.org>
> By the way, what generates that set of
> operations? I'm curious because I want to
> see what's going on in the compiler. You
> see, "incq" and that "mov, add, mov" set
> aren't equivalent; anything over
> $100000000 gets truncated with the set,
> but not with "incq", although it's not a
> concern if only the lower 32 bits are
> used.

I have to agree, they are not equivalent. I have
attached an example program so you can examine the
situation. It may or may not be an error.
Note: I compile with the -O4 parameter.
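
As a minimal sketch of the difference under discussion
(this is not the scrubbed attachment; the helper names
Inc64 and Inc32 are hypothetical, and it assumes the
"mov, add, mov" sequence works on 32-bit operands), the
two forms can be modelled in Pascal like this:

```pascal
program IncWidthSketch;
{$mode objfpc}
{$Q-}{$R-} // overflow/range checks off so wraparound is silent

// Hypothetical helper: models a full 64-bit increment such as "incq mem".
function Inc64(Value: QWord): QWord;
begin
  Result := Value + 1;
end;

// Hypothetical helper: models a load/add/store done on 32-bit operands.
// This is an assumption about the "mov, add, mov" sequence: the carry
// out of bit 31 is lost and only the low 32 bits are produced.
function Inc32(Value: QWord): QWord;
begin
  Result := (Value + 1) and QWord($FFFFFFFF);
end;

var
  V: QWord;
begin
  V := $FFFFFFFF; // largest value that still fits in 32 bits
  WriteLn('64-bit increment: ', HexStr(Int64(Inc64(V)), 16));
  WriteLn('32-bit sequence : ', HexStr(Int64(Inc32(V)), 16));
  // The low 32 bits agree, so the difference only matters when
  // the upper half of the value is actually used.
  WriteLn('low halves equal: ',
    DWord(Inc64(V) and $FFFFFFFF) = DWord(Inc32(V)));
end.
```

Starting from $FFFFFFFF, the 64-bit form carries into
bit 32 while the 32-bit model wraps to zero; for values
that stay below that boundary the two agree.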

> If both combinations run at about the same
> speed, then "incq" is better just on
> account of code size.
I spent some time examining "incq mem" and "mov
add mov".
On my particular CPU, when "incq" is an independent
instruction, it takes 1 clock cycle.
The "mov add mov" combination came out at about
1 - 1.2 clock cycles. A chain of "mov add mov" was
always a few clocks slower than a chain of "incq"
of the same length.
But when "incq" falls into a severe dependency
chain, it executes 25% worse than "mov add mov":
"incq": 4.5 clock cycles
"mov add mov": 3.8 clock cycles

I vote for shorter code and prefer "incq".

margers

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ohoulinc.pas
Type: text/x-pascal
Size: 881 bytes
Desc: not available
URL: <http://lists.freepascal.org/pipermail/fpc-devel/attachments/20181212/3326225a/attachment.pas>

