<div dir="ltr">Thanks...</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Feb 18, 2020 at 8:36 PM Alexander Hofmann via fpc-devel <<a href="mailto:fpc-devel@lists.freepascal.org">fpc-devel@lists.freepascal.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  
  <div>

    <p>Dear all,</p>

    <p>while investigating a bug in an application designed for ARM with

      floating point emulation enabled, I stumbled upon the following

      problem:</p>

    <p>When rounding large numbers to int64, some are actually rounded

      to a negative value, e.g.:</p>

    <p> round(1.5000000000000000E+018) => 1500000000000000000<br>

       round(1.5000005497558139E+018) => -491856199680</p>

    <p>Tested with trunk and 3.0.0 on a Raspberry Pi (armsf).<br>

    </p>

    <p>At first I thought it was specific to ARM, but it can also be

      reproduced on e.g. x86_64 by copying <tt>fpc_round_real</tt> from <tt>rtl/genmath.inc</tt>

      and calling the copy directly. By the way, the two floating-point

      numbers above differ by only one bit, which is bit 32 (i.e. the

      sign bit of a 32-bit number).</p>
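
    <p>A minimal sketch for inspecting the two bit patterns (not part of
      the original report; the program, function and variable names are
      made up for illustration). It copies each double into a QWord and
      prints the high and low 32-bit words plus their XOR, which should
      show a single differing bit if the observation above holds:</p>

<pre>
program ShowDoubleWords;
{$mode objfpc}

{ Reinterpret the 8 bytes of a double as an unsigned 64-bit integer.
  Assumes Double and QWord use the same byte order on the target. }
function DoubleBits(d: Double): QWord;
begin
  Result := 0;
  Move(d, Result, SizeOf(d));
end;

{ Print the upper and lower 32-bit halves of a 64-bit pattern in hex. }
procedure Dump(const caption: string; bits: QWord);
var
  highWord, lowWord: QWord;
begin
  highWord := bits shr 32;
  lowWord := bits and QWord($FFFFFFFF);
  WriteLn(caption, ': high=', HexStr(highWord, 8), ' low=', HexStr(lowWord, 8));
end;

var
  a, b: QWord;
begin
  a := DoubleBits(1.5000000000000000E+018);
  b := DoubleBits(1.5000005497558139E+018);
  Dump('first ', a);
  Dump('second', b);
  { Bits set here are exactly the bits in which the two doubles differ. }
  WriteLn('xor   : ', HexStr(a xor b, 16));
end.
</pre>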

    <p>I think the culprit is in line 1342 of genmath.inc, i.e.</p>

    <p><tt>            result:=((int64(hx) shl 32) or float64low(d)) shl

        (j0-52);</tt></p>

    <p><tt>float64low(d)</tt> returns the lower 32 bits of the float

      as a longint, which is negative for the second number. The

      compiler seems to widen this to a 64-bit signed integer by

      preserving the value rather than the bit pattern (i.e. it

      sign-extends), hence the invalid result. If this line is changed to</p>

    <p><tt>            result:=((int64(hx) shl 32) or <b>dword</b>(float64low(d)))

        shl (j0-52);</tt></p>

    <p>both floats are rounded correctly. The cast has to be to an

      unsigned type; an explicit cast to int64 does not work.</p>
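
    <p>The difference between the two forms can be sketched in isolation
      as follows. This is not the RTL code: the program name and the
      variables <tt>widened</tt>, <tt>zeroExtended</tt>, <tt>lowSigned</tt>
      and <tt>lowUnsigned</tt> are hypothetical, and the exponent and
      mantissa words are extracted by hand instead of via
      <tt>float64high</tt>/<tt>float64low</tt>. For the second number
      above, the first result should correspond to the negative value
      reported, the second to the expected positive one:</p>

<pre>
program SignExtendSketch;
{$mode objfpc}
{$R-}{$Q-}  { keep range/overflow checks off for the bit-level casts below }

var
  d: Double;
  bits: QWord;
  hx, lowUnsigned: LongWord;
  lowSigned, j0: LongInt;
  widened, zeroExtended: Int64;
begin
  d := 1.5000005497558139E+018;
  bits := 0;
  Move(d, bits, SizeOf(d));  { assumes Double and QWord share a byte order }

  { Unbiased base-two exponent of the double. }
  j0 := LongInt((bits shr 52) and $7FF) - $3FF;
  { Top 20 mantissa bits with the implicit leading 1 restored. }
  hx := LongWord((bits shr 32) and $FFFFF) or $100000;
  { Lower 32 mantissa bits, once unsigned and once read as signed. }
  lowUnsigned := LongWord(bits and $FFFFFFFF);
  lowSigned := LongInt(lowUnsigned);

  { LongInt operand: widened by value, so a negative low word
    sets all of the upper 32 bits before the OR. }
  widened := ((Int64(hx) shl 32) or lowSigned) shl (j0 - 52);
  { LongWord operand: widened with zeros, leaving the high word intact. }
  zeroExtended := ((Int64(hx) shl 32) or lowUnsigned) shl (j0 - 52);

  WriteLn('with longint low word: ', widened);
  WriteLn('with dword low word:   ', zeroExtended);
end.
</pre>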

    <p>Note that this code path is only taken if the base-two exponent

      of the floating-point number is larger than 51, so that the

      float has no fractional part. In my view, the

      rest of the code is not prone to this error; in line 1361 of <tt>genmath.inc</tt>

      the result of <tt>float64low</tt> is shifted right first, which

      will always leave a 0 at the position of the "sign bit".</p>

    <p>I am, however, not sure if this is because my compilers all have

      been compiled e.g. with the wrong switches (leading to some

      obscure optimization) or if this is indeed an error. I didn't find

      anything about this on the list or the net.<br>

    </p>

    <p>Attached is a little program illustrating the problem; most of

      the code is a 1:1 copy from <tt>genmath.inc</tt>.</p>

    <p><br>

    </p>

    <p>With best regards,</p>

    <p>Alex<br>

    </p>

  </div>



</blockquote></div>