[fpc-devel] Question about memory alignment (again!)

J. Gareth Moreton gareth at moreton-family.com
Wed Aug 17 14:12:24 CEST 2022


That is indeed the case, yes - thanks for pointing it out (SHR also sets 
the zero flag if the register's final value is zero). In most cases, the 
code generator doesn't produce assembly language that reads the flags 
directly from SHR - instead it emits CMP and TEST instructions for that, 
which are only optimised out in the post-peephole stage. Nevertheless, 
the existing optimisation does check whether the FLAGS register is in 
use, for safety.
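
To make the flag difference concrete, here is a rough Python model (not 
compiler code; the names shr32 and movzb_byte3 are made up for 
illustration). It assumes an immediate shift count between 1 and 31 and 
only models the carry and zero flags. The two forms agree on the value 
left in the register, but only SHR produces flags:

```python
from typing import NamedTuple

MASK32 = 0xFFFFFFFF


class ShrResult(NamedTuple):
    value: int   # register contents after the shift
    carry: bool  # CF: the last bit shifted out
    zero: bool   # ZF: set when the result is zero


def shr32(value: int, count: int) -> ShrResult:
    """Hypothetical model of a 32-bit SHR with a count in 1..31."""
    if not 1 <= count <= 31:
        raise ValueError("this model only covers counts in 1..31")
    value &= MASK32
    result = value >> count
    carry = bool((value >> (count - 1)) & 1)
    return ShrResult(result, carry, result == 0)


def movzb_byte3(value: int) -> int:
    """Hypothetical model of movzbl 3(mem): the top byte of a
    little-endian dword, zero-extended. No flags are produced."""
    return (value & MASK32).to_bytes(4, "little")[3]


r = shr32(0x12800000, 24)
print(hex(r.value), r.carry, r.zero)
print(hex(movzb_byte3(0x12800000)))
```

For a shift by 24 the carry comes from the former bit 23, so the 
replacement is only safe when nothing later reads the flags.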

That aside, might reading the byte at an offset that isn't 4-byte 
aligned cause performance problems?
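
For reference, the value side of the rewrites quoted below can be 
checked with a small Python sketch. The function names are invented for 
illustration, and the "<" format codes force little-endian decoding to 
match x86. It only compares values; it says nothing about timing. (A 
single-byte load has no alignment requirement of its own, but whether 
it is as fast in practice is exactly the open question.)

```python
import struct


def load_u32_then_shr24(mem: bytes, offset: int = 0) -> int:
    """Model of: movl (mem),%eax ; shrl $24,%eax"""
    (dword,) = struct.unpack_from("<I", mem, offset)
    return dword >> 24


def load_u8_at_plus3(mem: bytes, offset: int = 0) -> int:
    """Model of: movzbl 3(mem),%eax"""
    return mem[offset + 3]


def load_i32_then_sar24(mem: bytes, offset: int = 0) -> int:
    """Model of: movl (mem),%eax ; sarl $24,%eax (arithmetic shift)"""
    (dword,) = struct.unpack_from("<i", mem, offset)
    return dword >> 24  # Python's >> on a negative int keeps the sign


def load_i8_at_plus3(mem: bytes, offset: int = 0) -> int:
    """Model of: movsbl 3(mem),%eax"""
    (byte,) = struct.unpack_from("<b", mem, offset + 3)
    return byte


def load_u64_then_shr32(mem: bytes, offset: int = 0) -> int:
    """Model of: movq (mem),%rax ; shrq $32,%rax"""
    (qword,) = struct.unpack_from("<Q", mem, offset)
    return qword >> 32


def load_u32_at_plus4(mem: bytes, offset: int = 0) -> int:
    """Model of: movl 4(mem),%eax (upper half of %rax is zeroed)"""
    (dword,) = struct.unpack_from("<I", mem, offset + 4)
    return dword


mem = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])
print(load_u32_then_shr24(mem) == load_u8_at_plus3(mem))
print(load_i32_then_sar24(mem) == load_i8_at_plus3(mem))
print(load_u64_then_shr32(mem) == load_u32_at_plus4(mem))
```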

Gareth aka. Kit

On 17/08/2022 10:03, Martin Frb via fpc-devel wrote:
> On 17/08/2022 02:21, J. Gareth Moreton via fpc-devel wrote:
>> Hi everyone,
>>
>> Recently I've made some optimisations centred around the SHR 
>> instruction on x86, and there was one pair of instructions that 
>> caught my attention:
>>
>> movl (%rbx),%eax
>> shrl $24,%eax
>>
>> Is it permissible to optimise this to (x86 is little-endian):
>>
>> movzbl 3(%rbx),%eax?
>>
>> (You could also optimise "movl; sarl" into a "movsbl" instruction 
>> this way)
>>
>> Logically the result is the same and it removes an instruction and a 
>> pipeline stall, but will there be a performance hit that comes from 
>> reading an unaligned byte of memory like that?
>
> Doesn't shr set the carry flag to the former bit 23? (the last shifted 
> out)
> So it's not the same, unless there is no dependency on the carry flag 
> later on.
>
>>
>> I did make a similar optimisation once before with QWords using the 
>> implicit zero-extension of the 32-bit MOV instruction - that is:
>>
>> movq (%rbx),%rax
>> shrq $32,%rax
>>
>> To:
>>
>> movl 4(%rbx),%eax
>>
>> This one is a little nicer though because it's still on a 32-bit 
>> boundary and so was permissible.
>
> Same issue?

