<div dir="ltr">Thanks...</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Feb 18, 2020 at 8:36 PM Alexander Hofmann via fpc-devel <<a href="mailto:fpc-devel@lists.freepascal.org">fpc-devel@lists.freepascal.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>Dear all,</p>
<p>while investigating a bug in an application designed for ARM with
floating point emulation enabled, I stumbled upon the following
problem:</p>
<p>When rounding large numbers to int64, some are rounded to a
negative value, e.g.:</p>
<p> round(1.5000000000000000E+018) => 1500000000000000000<br>
round(1.5000005497558139E+018) => -491856199680</p>
<p>Tested with trunk and 3.0.0 on a Raspberry Pi (armsf).<br>
</p>
<p>At first I thought it was specific to ARM, but it can also be
reproduced on e.g. x86_64 by copying <tt>fpc_round_real</tt> from <tt>rtl/genmath.inc</tt>
and using this directly. By the way, the above fractional numbers
differ by only one bit, which is bit 32 (or the sign bit in a
32-bit number).</p>
<p>I think the culprit is in line 1342 of genmath.inc, i.e.</p>
<p><tt> result:=((int64(hx) shl 32) or float64low(d)) shl
(j0-52);</tt></p>
<p><tt>float64low(d)</tt> will return the lower 32 bits of the float
as a longint, which is negative in the case of the second
number. The compiler seems to expand this to a 64-bit signed
integer by preserving the numeric value rather than the bit pattern
(i.e. sign extension), hence the invalid
result. If this line is changed to</p>
<p><tt> result:=((int64(hx) shl 32) or <b>dword</b>(float64low(d)))
shl (j0-52);</tt></p>
<p>both floats are rounded correctly. It needs to be an unsigned
number, it does not work with an explicit cast to int64.</p>
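<p>The effect can be sketched in isolation, without the RTL. The
following is only an illustration: the variable names are made up,
and the two 32-bit words were worked out by hand from the second
example value above (high word with the implicit mantissa bit set,
exponent 60), so they are assumptions rather than values taken from <tt>genmath.inc</tt>.</p>
<pre>
program SignExtendDemo;

var
  hx  : longint;  { assumed high word of the mantissa, implicit bit set }
  lo  : longint;  { assumed low word; bit 31 is set, so it is negative }
  j0  : longint;  { unbiased exponent of the example value }
  bad, good : int64;
begin
  hx := $0014D112;
  lo := longint($8D7B1600);
  j0 := 60;

  { lo is sign-extended to 64 bits, so the OR overwrites the upper 32 bits }
  bad := ((int64(hx) shl 32) or lo) shl (j0 - 52);

  { the dword cast makes the value unsigned, so it is zero-extended instead }
  good := ((int64(hx) shl 32) or dword(lo)) shl (j0 - 52);

  writeln(bad);   { -491856199680 }
  writeln(good);  { 1500000549755813888 }
end.
</pre>
<p>If those assumed words are right, the first line of output matches
the wrong result reported above, and the second is the expected
rounding.</p>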
<p>Note that this code path is only taken if the
base-two exponent of the number is larger than 51, so that the
float has no fractional part. In my view, the
rest of the code is not prone to this error; in line 1361 of <tt>genmath.inc</tt>
the result of <tt>float64low</tt> is shifted right first, which
will always leave a 0 at the position of the "sign bit".</p>
<p>I am, however, not sure whether this happens because my compilers have all
been built with the wrong switches (leading to some
obscure optimization) or whether it is indeed an error. I didn't find
anything about this on the list or the net.<br>
</p>
<p>Attached is a little program illustrating the problem; most of
the code is a 1:1 copy from <tt>genmath.inc</tt>.</p>
<p><br>
</p>
<p>With best regards,</p>
<p>Alex<br>
</p>
</div>
</blockquote></div>