[fpc-devel] Re: Broken frac function in FPC3.1.1 / Windows x86_64

Thorsten Engler thorsten.engler at gmx.net
Sat Apr 28 17:57:59 CEST 2018


> -----Original Message-----
> From: fpc-devel <fpc-devel-bounces at lists.freepascal.org> On Behalf
> Of Florian Klämpfl
> So something like
> 
>   cmp       edx, $43300000
>   jge      @@zero
>   cmp       edx, $3FE00000
>   .align   16
>   jbe      @@skip
> 
> might be much better.

That ended up making things worse in some cases.

Here is a branchless version:

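function Frac1(const X: Double): Double;
asm
  .noframe
  movq      rdx, xmm0          // rdx := raw bits of X
  mov       rax, rdx           // keep a copy of the bits in rax
  xor       rcx, rcx           // rcx := 0 (bit pattern of +0.0)
  shr       rdx, 32            // edx := high dword (sign, exponent, top of mantissa)
  and       edx, $7FF00000     // isolate the 11 exponent bits
  cmp       edx, $43300000     // biased exponent $433 means |X| >= 2^52
  cmovge    rax, rcx           // no fraction bits left (or Inf/NaN): use 0 instead
  movq      xmm0, rax
  cvttsd2si rax, xmm0          // truncate towards zero to a 64-bit integer
  cvtsi2sd  xmm4, rax          // convert the integer part back to double
  subsd     xmm0, xmm4         // result := value - integer part
end;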
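
For anyone who would rather not read the assembly, here is a rough plain-Pascal sketch of what I understand Frac1 to compute. FracRef and the program around it are hypothetical names made up for illustration, it uses an ordinary "if" rather than a conditional move, and it says nothing about performance:

```pascal
program FracSketch;
{$mode objfpc}

// Hypothetical reference version of Frac1, for illustration only.
function FracRef(const X: Double): Double;
var
  V: Double;
  Bits: QWord;
  HighWord: LongWord;
begin
  V := X;
  Move(V, Bits, SizeOf(Bits));            // raw IEEE 754 bits of X
  HighWord := LongWord(Bits shr 32) and $7FF00000;  // exponent bits only
  if HighWord >= $43300000 then           // |X| >= 2^52, or Inf/NaN
    V := 0.0;                             // no fractional bits can remain
  Result := V - Trunc(V);                 // value minus its integer part
end;

begin
  WriteLn(FracRef(1e15 + 0.5):0:4);  // in range: fractional part 0.5
  WriteLn(FracRef(1e16 + 0.5):0:4);  // out of range: fractional part 0
  WriteLn(FracRef(0.5):0:4);         // only fraction: fractional part 0.5
  WriteLn(FracRef(-2.75):0:4);       // negative input: fractional part -0.75
end.
```

The three positive inputs are the same ones used in the timings below. Note that 1e16+0.5 is not representable as a Double and rounds to 1e16 before the function ever sees it.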

The branchless version performs slightly slower than the branching version in the "in range" case and noticeably worse in the other 2 cases, as it executes exactly the same instructions in all 3.

I would guess that the "in range" case is the most common: you aren't going to call Frac if you know ahead of time that the result is always 0 because the number is too big, or that the value already lies between -1 and 1. So the higher cost for the out-of-range and fraction-only cases is probably less important than it might look.

It IS largely independent of code alignment or predictable patterns in the incoming value:

Code address:
Frac1: 0000000000536430 (48)
Frac2: 0000000000536480 (0)
Frac3: 00000000005364D0 (80)
Frac4: 0000000000536520 (32)
Frac5: 0000000000536570 (112)
Frac6: 00000000005365C0 (64)
Frac7: 0000000000536610 (16)
Frac8: 0000000000536660 (96)
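
The number in parentheses after each address appears to be the address modulo 128, i.e. the offset of the function's entry point within its 128-byte block (all values in both listings are consistent with that reading). A minimal sketch of the calculation, with OffsetIn128 as a hypothetical helper name and the addresses copied from the listings:

```pascal
program AlignOffset;
{$mode objfpc}

// Hypothetical helper: offset of a code address within its 128-byte block.
function OffsetIn128(Addr: PtrUInt): PtrUInt;
begin
  Result := Addr and 127;   // same as Addr mod 128 for unsigned values
end;

begin
  WriteLn(OffsetIn128($536430));  // Frac1, first listing: 48
  WriteLn(OffsetIn128($536480));  // Frac2, first listing: 0
  WriteLn(OffsetIn128($536423));  // Frac1, second listing: 35
end.
```
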

1st run:
In range (1e15+0.5):
Frac1 1431794
Frac2 1429232
Frac3 1463357
Frac4 1475042
Frac5 1446016
Frac6 1472979
Frac7 1443244
Frac8 1467528

Out of range (1e16+0.5):
Frac1 1476556
Frac2 1458534
Frac3 1444431
Frac4 1427287
Frac5 1427326
Frac6 1427472
Frac7 1428914
Frac8 1419654

Only fraction (0.5):
Frac1 1470644
Frac2 1475227
Frac3 1447379
Frac4 1529162
Frac5 1509275
Frac6 1485185
Frac7 1500826
Frac8 1524294

Code address:
Frac1: 0000000000536423 (35)
Frac2: 0000000000536458 (88)
Frac3: 000000000053648D (13)
Frac4: 00000000005364C2 (66)
Frac5: 00000000005364F7 (119)
Frac6: 000000000053652C (44)
Frac7: 0000000000536561 (97)
Frac8: 0000000000536596 (22)

1st run:
In range (1e15+0.5):
Frac1 1349334
Frac2 1429198
Frac3 1447011
Frac4 1436476
Frac5 1477058
Frac6 1496887
Frac7 1431293
Frac8 1435460

Out of range (1e16+0.5):
Frac1 1349939
Frac2 1412543
Frac3 1462295
Frac4 1442081
Frac5 1512579
Frac6 1453593
Frac7 1457510
Frac8 1436533

Only fraction (0.5):
Frac1 1371353
Frac2 1443000
Frac3 1437583
Frac4 1415591
Frac5 1474870
Frac6 1437224
Frac7 1452196
Frac8 1453833

Also, it still outperforms Delphi's Frac in all cases.



