[fpc-devel] x86_64.inc CompareByte
Markus Beth
markus.beth at zkrd.de
Mon Oct 16 23:08:12 CEST 2017
On 16.10.2017 22:41, Florian Klämpfl wrote:
>> P.S.: I am currently working on another version of CompareByte that might have a slightly higher
>> latency for very small len but a higher throughput (2 cycles per iteration vs. 3 cycles on an Intel
>> Arrandale CPU (Westmere microarchitecture)). But this would need some more testing and benchmarking.
>> I can come up with it here again if this would be of any interest.
>
> Small lengths in terms of matching string or overall lengths?
Small in terms of the matching string, because there is some setup
work before the loop.
> BTW: I would really like to see a PCMPSTR based implementation :)
PCMPSTR is (at the moment) out of my scope. I thought PCMPSTR was part of
SSE4.2. How would you deal with Intel Core microarchitecture CPUs that
don't have it?
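
One common way to handle CPUs without SSE4.2 is a runtime check that falls back to a plain loop. The following is only a rough C sketch of that idea, not the FPC RTL code: the names compare_byte, compare_byte_scalar and compare_byte_sse42 are made up for illustration, the memcmp-like return convention (negative, zero, positive) is an assumption, and __builtin_cpu_supports is a GCC/Clang builtin rather than anything FPC provides.

```c
#include <stddef.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <nmmintrin.h>          /* SSE4.2 intrinsics, including _mm_cmpestri */
#define HAVE_X86_DISPATCH 1     /* hypothetical macro, for this sketch only */
#endif

/* Portable fallback: plain byte loop, usable on any CPU. */
static int compare_byte_scalar(const unsigned char *a,
                               const unsigned char *b, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i])
            return (int)a[i] - (int)b[i];
    }
    return 0;
}

#ifdef HAVE_X86_DISPATCH
/* SSE4.2 variant: PCMPESTRI compares 16 bytes per iteration. */
__attribute__((target("sse4.2")))
static int compare_byte_sse42(const unsigned char *a,
                              const unsigned char *b, size_t len)
{
    size_t i = 0;
    for (; i + 16 <= len; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        /* "Equal each" with negative polarity marks differing bytes;
           the result is the index of the first one, or 16 if none. */
        int idx = _mm_cmpestri(va, 16, vb, 16,
                               _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH |
                               _SIDD_NEGATIVE_POLARITY |
                               _SIDD_LEAST_SIGNIFICANT);
        if (idx < 16)
            return (int)a[i + idx] - (int)b[i + idx];
    }
    /* Fewer than 16 bytes left: finish with the scalar loop so that
       nothing is read past the end of the buffers. */
    return compare_byte_scalar(a + i, b + i, len - i);
}
#endif

/* Dispatcher: use the SSE4.2 path only if the running CPU reports it. */
int compare_byte(const void *a, const void *b, size_t len)
{
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("sse4.2"))
        return compare_byte_sse42(a, b, len);
#endif
    return compare_byte_scalar(a, b, len);
}
```

A real implementation would probably do the check once and cache a function pointer instead of testing on every call. Whether PCMPESTRI actually beats a PCMPEQB/PMOVMSKB loop would still need benchmarking.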