[fpc-devel] Thoughts: Make FillChar etc. an intrinsic for specialised performance potential
J. Gareth Moreton
gareth at moreton-family.com
Tue Apr 19 16:33:44 CEST 2022
I think most programmers know not to use FillChar to fill a managed
type, but it is a good point that care is needed.
Zeroing a register and moving it into a block is usually the fastest way
to go about such things, but FillChar can't assume the fill value is zero,
whereas if it were treated as an intrinsic, such an optimisation could be
made.
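
As a minimal sketch of what is being compared in this thread, the program below zeroes a small unmanaged record in three ways: FillChar with a constant zero, assignment from Default(), and plain per-field stores. The record type TSmallRec and the helpers Dirty and IsZero are made up for illustration. The sketch only demonstrates that the three forms leave the record in the same state; what machine code each one becomes depends on the compiler version, target and optimisation settings, and is not shown here.

```pascal
program ZeroSmallRecord;
{$mode objfpc}

type
  { Hypothetical unmanaged record (no strings, interfaces or dynamic
    arrays), used for illustration only }
  TSmallRec = record
    A, B: LongInt;
    C: Int64;
  end;

{ Put non-zero values in every field so each zeroing step has work to do }
procedure Dirty(out R: TSmallRec);
begin
  R.A := 1;
  R.B := 2;
  R.C := 3;
end;

{ True when every field of the record is zero }
function IsZero(const R: TSmallRec): Boolean;
begin
  Result := (R.A = 0) and (R.B = 0) and (R.C = 0);
end;

var
  R: TSmallRec;
begin
  { 1. FillChar: the fill value is an ordinary parameter of the routine }
  Dirty(R);
  FillChar(R, SizeOf(R), 0);
  WriteLn('FillChar:  ', IsZero(R));

  { 2. Default(): the compiler knows the type and that the value is zero }
  Dirty(R);
  R := Default(TSmallRec);
  WriteLn('Default:   ', IsZero(R));

  { 3. Per-field stores: the source-level equivalent of storing a zeroed
       register into each field }
  Dirty(R);
  R.A := 0;
  R.B := 0;
  R.C := 0;
  WriteLn('Per-field: ', IsZero(R));
end.
```

All three checks should report TRUE. The equivalence holds only because TSmallRec is unmanaged; for a record containing managed fields, FillChar would overwrite the references without finalising them.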
I'm not sure why Free Pascal would use REP MOVS to fill a buffer, since
that's for moving data from one buffer to another of the same size (e.g.
what Move sometimes uses), whereas REP STOS writes the value in
AL/AX/EAX/RAX to a destination buffer, so of course REP MOVS is going to
be slower.
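
The difference between the two instructions can be modelled in plain Pascal. This is only a sketch of their semantics, not of the code Free Pascal emits: StoreLike stands in for REP STOSB (one register value written repeatedly) and MoveLike stands in for REP MOVSB (each byte read from a source buffer, then written). Both routine names and the TBuffer type are hypothetical.

```pascal
program StosVersusMovs;
{$mode objfpc}

const
  Count = 64;

type
  TBuffer = array[0..Count - 1] of Byte;

{ Models REP STOSB: a single value, as if held in AL, is written to every
  byte of the destination. Only the destination memory is touched. }
procedure StoreLike(var Dest: TBuffer; Value: Byte);
var
  I: Integer;
begin
  for I := Low(Dest) to High(Dest) do
    Dest[I] := Value;
end;

{ Models REP MOVSB: every byte is read from Source and then written to
  Dest, so both buffers are touched. }
procedure MoveLike(const Source: TBuffer; var Dest: TBuffer);
var
  I: Integer;
begin
  for I := Low(Dest) to High(Dest) do
    Dest[I] := Source[I];
end;

var
  ZeroTemp, A, B: TBuffer;
  I: Integer;
  Same: Boolean;
begin
  { Start from non-zero contents so that the zeroing is observable }
  StoreLike(A, $FF);
  StoreLike(B, $FF);

  { Store-style fill: one pass over the destination }
  StoreLike(A, 0);

  { Move-style fill: a zeroed source of the same size must exist first,
    and is then read in full while the destination is written }
  StoreLike(ZeroTemp, 0);
  MoveLike(ZeroTemp, B);

  { Both approaches end with identical buffer contents }
  Same := True;
  for I := Low(A) to High(A) do
    if A[I] <> B[I] then
      Same := False;
  WriteLn('Same result: ', Same);
end.
```

The results are identical; the move-style fill simply needs an extra source buffer and a read for every write, which is the overhead being discussed.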
Gareth aka. Kit
On 19/04/2022 14:26, Stefan Glienke via fpc-devel wrote:
> I would rather give a hint to change the code, because the semantics of FillChar are equal to default() only for unmanaged types.
>
> As for performance - I just ran some benchmarks comparing the way FPC does it and how Delphi does it. xor a register and mov that into place (the Delphi way) is faster in all cases I tested. Also, once the record is large enough, FPC uses rep movs where Delphi uses rep stos. rep stos has faster timings than rep movs, so I cannot see how the FPC approach could be faster given the initial overhead of producing the empty temp.
>
>> On 19/04/2022 14:49 J. Gareth Moreton via fpc-devel <fpc-devel at lists.freepascal.org> wrote:
>>
>>
>> Interesting - I wasn't aware of this intrinsic! I'll make a note of
>> that one.
>>
>> It might be useful to transform FillChar calls to the Default intrinsic
>> at the node level.
>>
>> Gareth aka. Kit
>>
>> On 19/04/2022 12:43, Stefan Glienke via fpc-devel wrote:
>>> You are the expert, but I am not sure how that can be the case, given that you only need to zero a register and blast that into the record location, as opposed to the twice as many mov operations that I have seen generated for the record that Gareth originally posted.
>>>
>>>> On 19/04/2022 13:37 Sven Barth via fpc-devel <fpc-devel at lists.freepascal.org> wrote:
>>>>
>>>>
>>>> Stefan Glienke via fpc-devel <fpc-devel at lists.freepascal.org> wrote on Tue, 19 Apr 2022, 12:38:
>>>>> If you want to zero small records more efficiently, it might be better to use Default(t) for that and look at optimizing the code the compiler generates for it, as it seems to produce an empty temp variable which it assigns, instead of simply zeroing the record variable that default() is being assigned to.
>>>> This was an explicit design choice I made, because it pays off as soon as a second such assignment for the same type is made.
>>>>
>>>> Regards,
>>>> Sven
>>>> _______________________________________________
>>>> fpc-devel maillist - fpc-devel at lists.freepascal.org
>>>> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
>>>