[fpc-devel] Experimentation: "Branch stitching"

Martin Frb lazarus at mfriebe.de
Mon Nov 28 14:52:34 CET 2022


On 28/11/2022 14:32, J. Gareth Moreton via fpc-devel wrote:
> On 28/11/2022 12:59, Martin Frb via fpc-devel wrote:
>> Well first of all, you didn't move the balign in front of .Lj732
>
> I do move the alignment hints, but if the label becomes dead (due to 
> the zero-distance jump being 'collapsed'), the alignment hint gets 
> removed.  It's an experiment in progress.

Ah, yes right.
Anyway, this may be more of a 32-byte thing, and the 16-byte align is at 
best a 50/50 game.

I once had a better source on the topic (it might also be in the PDF I 
once sent), but for now:
https://superuser.com/questions/1368480/how-is-the-micro-op-cache-tagged

> Each 32B window (from the instruction cache) is mapped into the uop cache
In the case of an outer loop, the uops may or may not be cached, 
depending on the size of that cache and on what else is executed. It 
also only matters if the moved block is entered frequently (i.e. it is 
inside a loop).
But ultimately, the 16-byte align is not meant for that. Though if a 
user used a directive to set a 32-byte align, then that may matter.
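
To illustrate the 50/50 point, here is a rough Python sketch (not 
anything from the compiler). It assumes the 32-byte window size from 
the quoted answer; the 24-byte block size and the addresses are made up 
for illustration, and align_up / windows_spanned are hypothetical helper 
names. It counts how many 32-byte windows a block touches after aligning 
its start to 16 or to 32 bytes.

```python
WINDOW = 32  # window size taken from the quoted answer; an assumption here


def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def windows_spanned(start: int, size: int, window: int = WINDOW) -> int:
    """Count the aligned windows touched by the bytes [start, start + size)."""
    if size <= 0:
        return 0
    first = start // window
    last = (start + size - 1) // window
    return last - first + 1


if __name__ == "__main__":
    block_size = 24  # hypothetical size of the moved block, in bytes
    for unaligned in (0x1001, 0x1011):  # hypothetical pre-alignment addresses
        for alignment in (16, 32):
            start = align_up(unaligned, alignment)
            count = windows_spanned(start, block_size)
            print(f"align {alignment:2d}: start={start:#x} windows={count}")
```

With a 16-byte align the start address is either 0 or 16 modulo 32, so a 
block larger than 16 bytes ends up in one window or straddles two, 
depending on where it happened to fall. With a 32-byte align a block of 
up to 32 bytes always fits in a single window.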



