[fpc-devel] Broken frac function in FPC3.1.1 / Windows x86_64

Sat Apr 28 10:11:39 CEST 2018

Oops, small mistake caused by a last-minute change (I replaced rol with shl): it needs to be shr (or ror or rol; they all perform about the same on my CPU).

And in case anyone wonders, the first cmp and branch returns 0 for numbers that would cause an integer overflow, and the second cmp and branch skips the whole thing if the input is between -1 and 1 (so it is already just a fraction).
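
The two comparisons can be sketched in plain Pascal. This is only an illustration, not code from this thread: ExpField, Classify and TFracCase are hypothetical names, and it assumes IEEE 754 doubles with the biased exponent in bits 52..62. The constants $433 and $3FE are the exponent parts of the $43300000 and $3FE00000 immediates in the assembler versions.

```pascal
program ExpCheck;
{$mode objfpc}

type
  // Hypothetical names for the three paths through the assembler routine.
  TFracCase = (fcZero, fcSkip, fcCompute);

// Biased 11-bit exponent of an IEEE 754 double (bits 52..62).
function ExpField(X: Double): Integer;
var
  Bits: QWord;
begin
  Move(X, Bits, SizeOf(Bits));      // reinterpret the 8 bytes of X
  Result := Integer((Bits shr 52) and $7FF);
end;

function Classify(X: Double): TFracCase;
var
  E: Integer;
begin
  E := ExpField(X);
  if E >= $433 then        // |X| >= 2^52: no fractional bits left
    Result := fcZero
  else if E <= $3FE then   // |X| < 1: X is already its own fraction
    Result := fcSkip
  else
    Result := fcCompute;   // truncate, convert back, subtract
end;

begin
  WriteLn(Classify(1e16 + 0.5));
  WriteLn(Classify(0.5));
  WriteLn(Classify(1e15 + 0.5));
end.
```

The three calls use the same inputs as the timings in this mail, one per path.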

Also, at least on my CPU (AMD Phenom II X6, so not exactly the newest) the effect of code alignment on performance is huge. (It affects the branch predictor I think.) I’m not sure what the best alignment is for all CPUs. 

.align 16 forces alignment of the function entry point to a multiple of 16. If I add between 0 and 3 nops at the start of the function, the timings for calling it 10 million times are:

In range (1e15+0.5):
  0 nop  1266149
  1 nop  4260343
  2 nop  1369745
  3 nop  4469482

Out of range (1e16+0.5):
  0 nop   881536
  1 nop   896240
  2 nop   890805
  3 nop   871582

Only fraction (0.5):
  0 nop   894850
  1 nop  2219469
  2 nop   955618
  3 nop  2303233
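
A minimal harness for this kind of measurement might look like the sketch below. It is an assumption, not the code that produced the numbers above: it times the RTL's Frac rather than the assembler routines, TimeFrac is a hypothetical helper, and it reports milliseconds from GetTickCount64, whereas the unit of the figures above is not stated.

```pascal
program FracBench;
{$mode objfpc}

uses
  SysUtils;

const
  Iterations = 10000000; // 10 million calls, as in the timings above

// Hypothetical helper: milliseconds spent calling Frac(X) Iterations times.
function TimeFrac(X: Double): QWord;
var
  I: Integer;
  Sum: Double;
  Start: QWord;
begin
  Sum := 0;
  Start := GetTickCount64;
  for I := 1 to Iterations do
    Sum := Sum + Frac(X);
  Result := GetTickCount64 - Start;
  // Touch the sum so the loop's result is actually used.
  if Sum < 0 then
    WriteLn(Sum);
end;

begin
  WriteLn('in range      ', TimeFrac(1e15 + 0.5), ' ms');
  WriteLn('out of range  ', TimeFrac(1e16 + 0.5), ' ms');
  WriteLn('only fraction ', TimeFrac(0.5), ' ms');
end.
```

To compare alignments, the call to Frac would be swapped for the routine under test.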

Leaving out the check for whether it’s already a fraction decreases the time for in-range numbers and increases it for ones that are already a fraction:

In range (1e15+0.5):
  do skip  1306063
  no skip  1121395

Out of range (1e16+0.5):
  do skip   887081
  no skip   888925

Only fraction (0.5):
  do skip   903330
  no skip  1124026

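function FracDoSkip(const X: Double): Double;
asm
  .align 16
  .noframe
  movq      rdx, xmm0          // raw bits of X
  rol       rdx, 32            // high dword (sign, exponent) into edx
  and       edx, $7FF00000     // keep only the exponent field
  cmp       edx, $43300000     // exponent >= $433: |X| >= 2^52
  jge      @@zero
  cmp       edx, $3FE00000     // exponent <= $3FE: |X| < 1
  jbe      @@skip
  cvttsd2si rax, xmm0          // truncate toward zero
  cvtsi2sd  xmm4, rax
  subsd     xmm0, xmm4         // X - Trunc(X)
  jmp      @@skip
@@zero:
  xorpd     xmm0, xmm0
@@skip:
end;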

function FracNoSkip(const X: Double): Double;
asm
  .align 16
  .noframe
  movq      rdx, xmm0
  rol       rdx, 32
  and       edx, $7FF00000
  cmp       edx, $43300000
  jge      @@zero
//  cmp       edx, $3FE00000
//  jbe      @@skip
  cvttsd2si rax, xmm0
  cvtsi2sd  xmm4, rax
  subsd     xmm0, xmm4
  jmp      @@skip
@@zero:
  xorpd     xmm0, xmm0
@@skip:
end;

Cheers,

Thorsten

From: fpc-devel <fpc-devel-bounces at lists.freepascal.org> On Behalf Of Thorsten Engler
Sent: Saturday, 28 April 2018 15:37
To: 'FPC developers' list' <fpc-devel at lists.freepascal.org>
Subject: Re: [fpc-devel] *** GMX Spamverdacht *** Re: Broken frac function in FPC3.1.1 / Windows x86_64

I’ve only tested it in Delphi, so you’ll have to convert it to the right syntax for fpc, but either of these should do:

function Frac1(const X: Double): Double;
asm
  .align 16
  .noframe
  movq      rdx, xmm0
  shl       rdx, 32            // per the correction at the top: needs to be shr (or ror/rol)
  and       edx, $7FF00000
  cmp       edx, $43300000
  jge      @@zero
  cmp       edx, $3FE00000
  jbe      @@skip
  cvttsd2si rax, xmm0
  cvtsi2sd  xmm4, rax
  subsd     xmm0, xmm4
  jmp      @@skip
@@zero:
  xorpd     xmm0, xmm0
@@skip:
end;

function Frac2(const X: Double): Double;
asm
  .align 16
  .noframe
  movq      rdx, xmm0
  shl       rdx, 48            // per the correction at the top: needs to be shr
  and       dx, $7FF0
  cmp       dx, $4330
  jge      @@zero
  cmp       dx, $3FE0
  jbe      @@skip
  cvttsd2si rax, xmm0
  cvtsi2sd  xmm4, rax
  subsd     xmm0, xmm4
  jmp      @@skip
@@zero:
  xorpd     xmm0, xmm0
@@skip:
end;

From: fpc-devel <fpc-devel-bounces at lists.freepascal.org> On Behalf Of Sven Barth via fpc-devel
Sent: Friday, 27 April 2018 23:47
To: FPC developers' list <fpc-devel at lists.freepascal.org>
Cc: Sven Barth <pascaldragon at googlemail.com>
Subject: *** GMX Spamverdacht *** Re: [fpc-devel] Broken frac function in FPC3.1.1 / Windows x86_64

Bart <bartjunk64 at gmail.com> wrote on Fri, 27 Apr 2018, 13:42:

On Wed, Apr 25, 2018 at 11:04 AM, <info at wolfgang-ehrhardt.de> wrote:

> If you compile and run this 64-bit program on Win 64 you get a crash

And AFAICS your analysis of the cause (see bugtracker) is correct as well.

function fpc_frac_real(d: ValReal) : ValReal;compilerproc; assembler;
nostackframe;
  asm
    cvttsd2si %xmm0,%rax
    { Windows defines %xmm4 and %xmm5 as first non-parameter volatile registers;
      on SYSV systems all are considered as such, so use %xmm4 }
    cvtsi2sd %rax,%xmm4
    subsd %xmm4,%xmm0
  end;

CVTTSD2SI — Convert with Truncation Scalar Double-Precision
Floating-Point Value to Signed Integer
This should not be used to get a ValReal result.

The code essentially does the following (instruction by instruction):

=== code begin ===
tmpi := trunc(d);
tmpd := double(tmpi);
Result := d - tmpd;
=== code end ===
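
For comparison, the same three steps with a range guard in front can be sketched in plain Pascal. SafeFrac and TwoPow52 are hypothetical names, and the 2^52 cutoff is taken from the $433 exponent check in the assembler versions earlier in this thread, not from the RTL.

```pascal
program SafeFracDemo;
{$mode objfpc}

const
  // 2^52: doubles at or above this magnitude have no fraction bits.
  TwoPow52 = 4503599627370496.0;

// Hypothetical guarded variant of the truncate/convert/subtract sequence.
function SafeFrac(D: Double): Double;
var
  TmpI: Int64;
  TmpD: Double;
begin
  if Abs(D) >= TwoPow52 then
    Exit(0.0);          // too large to have a fraction; skip the conversion
  TmpI := Trunc(D);     // corresponds to cvttsd2si
  TmpD := TmpI;         // corresponds to cvtsi2sd
  Result := D - TmpD;   // corresponds to subsd
end;

begin
  WriteLn(SafeFrac(1e15 + 0.5));
  WriteLn(SafeFrac(-2.25));
  WriteLn(SafeFrac(1e300));
end.
```

With the guard in place, Trunc is only ever called on values whose integer part fits in an Int64.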

Though why it fails with the given value is a different topic... 

Regards, 

Sven 
