[fpc-devel] x86_64.inc CompareByte
Florian Klämpfl
florian at freepascal.org
Sun Oct 22 20:55:19 CEST 2017
On 21.10.2017 01:24, Markus Beth wrote:
> Find attached the already announced version of CompareByte.
>
What benchmark did you use? In my tests it is slightly slower than the one in FPC 3.0.x.
I used the following test program:
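var
  buf1,buf2 : array[0..127] of byte;
  len,i,j : longint;
begin
  for i:=1 to 100 do
    begin
      { random length (0..99) and random contents for both buffers }
      len:=random(100);
      for j:=0 to len-1 do
        begin
          buf1[j]:=random(256);
          buf2[j]:=random(256);
        end;
      { make the first 1..10 bytes identical so the buffers share a prefix }
      for j:=0 to random(10) do
        buf2[j]:=buf1[j];
      { the call being measured }
      for j:=1 to 1000000 do
        CompareByte(buf1,buf2,len);
    end;
end.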
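
The test program above does not print any timing itself, so it is presumably timed from outside (for example with the shell's time command). A minimal sketch of how it could report its own timing, assuming GetTickCount64 from SysUtils is precise enough for this purpose; the program name, the constants, the fixed RandSeed and the checksum accumulator are illustrative additions and not part of the original test:

```pascal
program cmpbench;
{$mode objfpc}
uses
  SysUtils;                       { GetTickCount64 }
const
  Rounds = 100;                   { same outer loop count as the original test }
  Calls  = 1000000;               { same inner loop count as the original test }
var
  buf1, buf2 : array[0..127] of byte;
  len, i, j : longint;
  acc : int64;                    { sum of results, so the calls are not dead code }
  t0, elapsed : QWord;
begin
  RandSeed := 42;                 { assumption: fixed seed for repeatable input }
  acc := 0;
  elapsed := 0;
  for i := 1 to Rounds do
    begin
      len := random(100);
      for j := 0 to len - 1 do
        begin
          buf1[j] := random(256);
          buf2[j] := random(256);
        end;
      for j := 0 to random(10) do
        buf2[j] := buf1[j];
      { time only the CompareByte calls, not the buffer setup }
      t0 := GetTickCount64;
      for j := 1 to Calls do
        acc := acc + CompareByte(buf1, buf2, len);
      elapsed := elapsed + (GetTickCount64 - t0);
    end;
  writeln('total: ', elapsed, ' ms, checksum: ', acc);
end.
```

Since the same seed produces the same buffers on every run, two RTL builds can be compared by running this once against each and comparing the reported totals; the checksum should be identical if both CompareByte versions return the same values.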
>
>
> On 16.10.2017 23:08, Markus Beth wrote:
>> On 16.10.2017 22:41, Florian Klämpfl wrote:
>>>> P.S.: I am currently working on another version of CompareByte that might have a slightly higher
>>>> latency for very small len but a higher throughput (2 cycles per iteration vs. 3 cycles on an Intel
>>>> Arrandale CPU (Westmere microarchitecture)). But this would need some more testing and
>>>> benchmarking.
>>>> I can come up with it here again if this would be of any interest.
>>>
>>> Small lengths in terms of matching string or overall lengths?
>>
>> Small in terms of the matching string, as there is some setup work before the loop.
>>
>>> BTW: I would really like to see a PCMPSTR based implementation :)
>> PCMPSTR is (at the moment) out of my scope. I thought PCMPSTR is part of SSE4.2. How would you
>> deal with Intel core microarchitecture CPUs that don't have it?
>
>