[fpc-devel] Attn: J. Gareth // 3.3.1 opt = slower // Fwd: [Lazarus] Faster than popcnt

J. Gareth Moreton gareth at moreton-family.com
Wed Jan 5 06:36:39 CET 2022

That's why I like going for optimisations that reduce code size without 
sacrificing speed: smaller code spans fewer 16-byte or 32-byte 
sections.  Anyhow, back to work on optimising!
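
To put a rough number on the block-count argument, here is a small sketch (Python, purely illustrative). The addresses and byte sizes are made up, and blocks_touched is a hypothetical helper, not anything in FPC. It only counts how many aligned 32-byte windows a run of code overlaps; it says nothing about how a particular CPU actually fetches or decodes them.

```python
def blocks_touched(start, size, block=32):
    """Count the aligned `block`-byte windows overlapped by code
    occupying the address range [start, start + size)."""
    if size <= 0:
        return 0
    first = start // block               # index of the window holding the first byte
    last = (start + size - 1) // block   # index of the window holding the last byte
    return last - first + 1


if __name__ == "__main__":
    # Hypothetical loop body: 40 bytes before a size optimisation, 30 bytes after.
    for start in (0x1000, 0x1010, 0x101C):
        before = blocks_touched(start, 40)
        after = blocks_touched(start, 30)
        print(f"start={start:#x}: 40 bytes -> {before} blocks, 30 bytes -> {after} blocks")
```

With these made-up numbers, trimming the code from 40 to 30 bytes drops the count from 2 to 1 at 0x1000 and from 3 to 2 at 0x101C, but changes nothing at 0x1010. So a smaller loop can only ever touch the same number of blocks or fewer, yet whether it actually gains anything still depends on where it happens to land.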

Gareth aka. Kit

On 04/01/2022 19:33, Martin Frb via fpc-devel wrote:
> On 04/01/2022 18:43, Jonas Maebe via fpc-devel wrote:
>> On 03/01/2022 12:54, Martin Frb via fpc-devel wrote:
>>> not sure if this is of interest to you, but I see you do a lot on 
>>> the optimizer....
>> It's very likely unrelated to anything the optimiser does or does not 
>> do, and more to do with random differences in code layout. Charlie 
>> posted the following video on core just yesterday, and it touches on 
>> exactly this subject: https://www.youtube.com/watch?v=r-TLSBdHe1A
>> Choice quote: code layout and environment variables can produce up to 
>> 40% differences in performance, which is more than what even the best 
>> optimizing compilers can achieve in most cases.
> Interesting...
> And yes, see my previous post. It seems to depend on which 
> "sub-section" of a loop falls into a 32-byte-aligned 32-byte block.
> It's not even the entire loop (it's not about where the loop begins), 
> but a certain code block within it.
> This also fits with one optimization that (even though it is still 
> down to chance) improved the timing in my test (both worst and best 
> time, though those are only *my* worst/best)
> => reducing the byte size of the loop code.
> That way there are fewer 32-byte sections.
> _______________________________________________
> fpc-devel maillist  -  fpc-devel at lists.freepascal.org
> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
