[fpc-devel] Broken frac function in FPC3.1.1 / Windows x86_64
J. Gareth Moreton
gareth at moreton-family.com
Sun Apr 29 04:36:16 CEST 2018
As an extra point, removing the 'skip' check (i.e. cmp ax, $3FE0, jbe
@@skip) removes 6 bytes from the code size and shaves about 2 to 3
nanoseconds off the execution time in most cases, and it could be argued
that it's worth going for the 'no skip' version because using Frac on a
value of x where |x| < 1 is rather uncommon compared to when |x| >= 1.
However, when running my timing tests, one thing that confused me is that
with very large inputs like 10^300, the no-skip function is at least 5
nanoseconds slower than FracSkip2, even though its code is less complex.
This happens even if I put 'align 16' before the @@zero label.
I did wonder if it being a debug build caused some issues, but when I
compiled it with full optimisation, both versions of the function ran
slower for numbers of that size (and the original FracDoSkip took about
as long), and SafeFrac beat them by around 5 nanoseconds.
Nevertheless, I conclude that for most situations, using the improved
FracNoSkip gives the best performance and size for typical inputs, but this
may depend on an individual machine's architecture.
****
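function FracNoSkp2(const X: ValReal): ValReal; assembler; nostackframe;
asm
  movq rax, xmm0          // copy the raw bits of X into RAX
  shr rax, 48             // keep the top 16 bits: sign, exponent, 4 mantissa bits
  and ax, $7FF0           // isolate the 11-bit biased exponent
  cmp ax, $4330           // exponent $433 = 1075, i.e. |X| >= 2^52 (also Inf, NaN)
  jge @@zero              // no fractional bits left, so return zero
  cvttsd2si rax, xmm0     // truncate X towards zero to a 64-bit integer
  cvtsi2sd xmm4, rax      // convert the integer part back to a double
  subsd xmm0, xmm4        // X - Trunc(X)
  ret
@@zero:
  xorpd xmm0, xmm0        // result is +0.0
end;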
****
Note: 'align 16' at the start of a procedure is usually unnecessary, as
FPC aligns procedures to 16-byte boundaries automatically. FracNoSkp2 has
a code size of 39 bytes, so it will fill 48 bytes (three 16-byte blocks),
which is one block smaller than the original FracNoSkip and the current
Frac function.
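For anyone who would rather read the exponent test in plain Pascal, here is a
rough sketch of the same logic. FracRef is a hypothetical name used only for
illustration; it is not the RTL code and not one of the routines in the test
project. It assumes ValReal is an IEEE 754 double, as on Windows x86_64. In
these terms the 'skip' check (cmp ax, $3FE0) corresponds to a biased exponent
of $3FE or less, i.e. |x| < 1.

```pascal
program fracref;
{$mode objfpc}

{ Hypothetical portable restatement of the exponent test in FracNoSkp2. }
function FracRef(const X: Double): Double;
var
  V: Double;
  Bits: QWord absolute V;   // raw IEEE 754 bit pattern of V
  BiasedExp: QWord;
begin
  V := X;
  BiasedExp := (Bits shr 52) and $7FF;   // 11-bit biased exponent
  if BiasedExp >= $433 then
    Result := 0.0            // |X| >= 2^52, Inf or NaN: no fractional bits
  else
    Result := X - Trunc(X);  // Trunc fits in Int64 because |X| < 2^52
end;

begin
  WriteLn(FracRef(2.75):0:4);
  WriteLn(FracRef(-2.75):0:4);
  WriteLn(FracRef(1e300):0:4);
end.
```

Like the assembler version, this returns positive zero for huge inputs,
infinities and NaNs, which may or may not be the behaviour wanted for the
final Frac.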
I've attached my test project to this e-mail if you wish to look at the
figures yourselves (I hope attachments work) and make a more informed
decision. This will currently only run on Windows due to the use of
QueryPerformanceCounter for timing checks. These calls will need to be
removed to run this on Linux.
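One possible way to time things on Linux without QueryPerformanceCounter is
sketched below. This is not the harness from the attached project; the
function name MeasureFracNs and the iteration count are made up for
illustration. It assumes that the millisecond resolution of GetTickCount64
(from SysUtils) is acceptable when the loop runs long enough, and the figure
it prints includes the loop and addition overhead, so it is only useful for
comparing variants against each other.

```pascal
program fractiming;
{$mode objfpc}

uses
  SysUtils;   // GetTickCount64

{ Hypothetical helper: average time per loop iteration in nanoseconds. }
function MeasureFracNs(Iterations: LongInt; out Checksum: Double): Double;
var
  I: LongInt;
  StartMs, ElapsedMs: QWord;
  X: Double;
begin
  X := 1234.5678;
  Checksum := 0.0;
  StartMs := GetTickCount64;
  for I := 1 to Iterations do
    Checksum := Checksum + Frac(X + I);  // accumulate so the call is not optimised away
  ElapsedMs := GetTickCount64 - StartMs;
  Result := ElapsedMs * 1.0e6 / Iterations;  // ms -> ns, averaged over the loop
end;

var
  Sum: Double;
begin
  WriteLn('approx. ns per iteration: ', MeasureFracNs(50000000, Sum):0:2);
  WriteLn('checksum: ', Sum:0:4);
end.
```

Swapping Frac for one of the candidate routines in the loop gives a rough
side-by-side comparison, though differences of a few nanoseconds will need
many iterations and repeated runs to show up reliably.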
Gareth aka. Kit
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fractest.lpr
Type: application/octet-stream
Size: 7259 bytes
Desc: not available
URL: <http://lists.freepascal.org/pipermail/fpc-devel/attachments/20180429/d5500310/attachment.obj>