[fpc-devel] x86_64.inc CompareByte

Florian Klämpfl florian at freepascal.org
Sun Oct 22 20:55:19 CEST 2017


On 21.10.2017 at 01:24, Markus Beth wrote:
> Find attached the already announced version of CompareByte.
> 

What benchmark did you use? In my tests it is slightly slower than the one in fpc 3.0.x.

I used the following test program:

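var
  buf1,buf2 : array[0..127] of byte;
  pos,len,i,j : longint;

begin
  for i:=1 to 100 do
    begin
      { pick a random compare length in 0..99 and fill both buffers with random bytes }
      len:=random(100);
      for j:=0 to len-1 do
        begin
          buf1[j]:=random(256);
          buf2[j]:=random(256);
        end;

      { make the first 1..10 bytes of both buffers identical }
      for j:=0 to random(10) do
        buf2[j]:=buf1[j];

      { call the routine under test one million times on this input }
      for j:=1 to 1000000 do
        CompareByte(buf1,buf2,len);
    end;
end.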
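
The program above does not time anything itself, so the numbers presumably come from timing the whole run externally. A minimal sketch of a self-timing variant follows. It assumes GetTickCount64 from SysUtils as the clock; the fixed RandSeed value, the constant names and the checksum variable are illustrative choices and not part of the original test.

```pascal
program cmpbench;
{$mode objfpc}

uses
  SysUtils; { GetTickCount64 }

const
  Rounds = 100;      { same outer count as the test program above }
  Calls  = 1000000;  { same inner count as the test program above }

var
  buf1, buf2 : array[0..127] of byte;
  len, i, j : longint;
  sink : int64;             { sums the results so every call is used }
  start, elapsed : qword;

begin
  RandSeed := 42;  { arbitrary fixed seed, so runs see the same inputs }
  sink := 0;
  elapsed := 0;
  for i := 1 to Rounds do
    begin
      len := random(100);
      for j := 0 to len - 1 do
        begin
          buf1[j] := random(256);
          buf2[j] := random(256);
        end;

      for j := 0 to random(10) do
        buf2[j] := buf1[j];

      { time only the CompareByte calls, not the buffer setup }
      start := GetTickCount64;
      for j := 1 to Calls do
        sink := sink + CompareByte(buf1, buf2, len);
      elapsed := elapsed + (GetTickCount64 - start);
    end;
  writeln('total: ', elapsed, ' ms, checksum: ', sink);
end.
```

GetTickCount64 only has millisecond resolution (coarser on some systems), so the per-round deltas are rough and only the accumulated total is meaningful. Building the same source once against each RTL version and comparing the totals should give a comparable figure.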

> 
> 
> On 16.10.2017 23:08, Markus Beth wrote:
>> On 16.10.2017 22:41, Florian Klämpfl wrote:
>>>> P.S.: I am currently working on another version of CompareByte that might have a slightly higher
>>>> latency for very small len but a higher throughput (2 cycles per iteration vs. 3 cycles on an Intel
>>>> Arrandale CPU (Westmere microarchitecture)). But this would need some more testing and
>>>> benchmarking.
>>>> I can come up with it here again if this would be of any interest.
>>>
>>> Small lengths in terms of matching string or overall lengths?
>>
>> It is small length in terms of the matching string, as there is some setup work before the loop.
>>
>>> BTW: I would really like to see a PCMPSTR based implementation :)
>> PCMPSTR is (at the moment) out of my scope. I thought PCMPSTR was part of SSE4.2. How would you
>> deal with Intel Core microarchitecture CPUs that don't have it?