<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>Heh, I had a feeling I was overcomplicating it - thanks Sven. 

      I'll start experimenting on that one.</p>

    <p>On a similar note, my branch has a number of other merge requests

      embedded in it, namely the "typeconv-strip" and "nothing-strip"

      branches (!232 and !342 respectively) and these are requred to

      simplify the node trees during purity analysis.  Some node classes

      that represent intrinsics may require additional programming in

      order to convert them into their actual results.<br>

    </p>

    <p>I'm also trying to think what cases I'll need to cover, both

      positive and negative - for example:</p>

    <p>- Calling a pure function with a variable actual parameter (the

      formal parameter is constant or passed by value) will call the

      function conventionally, but not flag it as impure.<br>

      - If a function is marked as pure, any references to non-local

      symbols (other than constants that can be immediately resolved)

      will throw a compiler warning and mark the function as impure.<br>

      - Any code path that results in a non-deterministic result (might

      only be possible when trying to analyse it for a given input) will

      throw a compiler warning and mark the function as impure.<br>

      - If purity analysis throws a compiler error (e.g. a division by

      zero), this will cause the program to fail to compile.</p>

    <p>Some cases might be immediate compiler errors though - for

      example, explicitly raising an exception in a pure function is

      essentially forbidden.</p>

    <p>To better explain how purity analysis currently works (I'm sure

      there's a better name than "purity analysis"), it takes a copy of

      the unoptimised node tree (this is the same as the tree used for

      inline, and for a space saving, they share the same field), adds

      explicit definitions for the formal parameters so they equal the

      constants passed in, and then tries to collapse the node tree down

      to a single assignment to the result.  This is done by running the

      following operations in this order:</p>

    <p>- Constant propagation<br>

      - For-loop unrolling<br>

      - Inline simplify (this will remove a lot of conditional branches

      from the tree; without it, the first pass below will raise an

      error if it comes across "if (Y <> 0) then Result := X mod

      Y;", for example)<br>

      - First pass (needed to parse nested pure functions)<br>

      - Dead store elimination</p>

    <p>The process is repeated until no more changes are made.</p>

    <p>There are some bailout conditions, some of which I don't know

      should be considered 'critical' (i.e. mark the function as

      'impure') or not:</p>

    <p>- If the recursion exceeds a certain depth (currently set to 64).<br>

      - If the node count exceeds a certain limit (currently set to

      1,024).<br>

      - If a for-loop could not be successfully unrolled (this generally

      only happens if the start and end values are not constant).<br>

      - The first pass stage raises a compiler error (this is a genuine

      error case and the compiler will not build the program).<br>

      - The node tree does not simplify into a simple assignment to the

      result.</p>

    <p>For the recursion and node count limits, I plan to test it on the

      Ackermann function:</p>

    <p>function Ackermann(m, n: Cardinal): Cardinal; pure;<br>

      begin<br>

        if m = 0 then<br>

          Result := n + 1<br>

        else if n = 0 then<br>

          Result := Ackermann(m - 1, 1)<br>

        else<br>

          Result := Ackermann(m - 1, Ackermann(m, n - 1));<br>

      end;</p>

    <p>This is also a good example to test for overflow conditions

      (which should be classed as an error), since the correct answer to

      Ackermann(4, 2), for example, has 19,729 decimal digits!<br>

    </p>

    <p>Note: marking the function as impure, when the compiler knows

      that a function is not actually pure in some circumstances, will

      speed up further compilation.</p>

    <p>In the end, I will also try to implement a system where, after an

      output is calculated for a given input, the compiler will try to

      remember it so if the same inputs appear again in the program, the

      compiler does not have to recalculate the output and can just

      retrieve it.<br>

    </p>

    <p>Kit<br>

    </p>

    <div class="moz-cite-prefix">On 14/12/2022 10:18, Sven Barth via

      fpc-devel wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAFMUeB-2dX7rAJj6Kh+rnaeM_cfH+OHRxZkf6QQRXaboiAOGRw@mail.gmail.com">

      <meta http-equiv="content-type" content="text/html; charset=UTF-8">

      <div dir="auto">

        <div>

          <div class="gmail_quote">

            <div dir="ltr" class="gmail_attr">J. Gareth Moreton via

              fpc-devel <<a

                href="mailto:fpc-devel@lists.freepascal.org"

                moz-do-not-send="true" class="moz-txt-link-freetext">fpc-devel@lists.freepascal.org</a>>

              schrieb am Di., 13. Dez. 2022, 22:09:<br>

            </div>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">The next

              big milestone that I want to achieve is to make this a

              pure <br>

              function:<br>

              <br>

              procedure int_str_unsigned(l:longword;out s:shortstring);

              pure;<br>

              var<br>

                 m1 : longword;<br>

                 pcstart,<br>

                 pc2start,<br>

                 pc,pc2 : pchar;<br>

                 hs : string[32];<br>

                 overflow : longint;<br>

              begin<br>

                 pc2start:=@s[1];<br>

                 pc2:=pc2start;<br>

                 pcstart:=pchar(@hs[0]);<br>

                 pc:=pcstart;<br>

                 repeat<br>

                   inc(pc);<br>

                   m1:=l div 10;<br>

                   pc^:=char(l-(m1*10)+byte('0'));<br>

                   l:=m1;<br>

                 until l=0;<br>

                 overflow:=(pc-pcstart)-high(s);<br>

                 if overflow>0 then<br>

                   inc(pcstart,overflow);<br>

                 while (pc>pcstart) do<br>

                   begin<br>

                     pc2^:=pc^;<br>

                     inc(pc2);<br>

                     dec(pc);<br>

                   end;<br>

                 s[0]:=char(pc2-pc2start);<br>

              end;<br>

              <br>

              This is essentially the core internal function that drives

              IntToStr and <br>

              similar functions.  The challenges here include:<br>

              <br>

              - A repeat...until loop with no obvious termination

              sequence.<br>

              - Using a pointer to access an array offset of a local

              variable.<br>

              - Writing characters (and the length field) one at a time

              to a shortstring.<br>

              <br>

              The reason for wishing to make IntToStr a pure function is

              that for a <br>

              given input, the output will never change, and it's

              perfectly feasible <br>

              for some other function to call IntToStr as part of a

              string generation <br>

              routine and which would otherwise itself be a pure

              function (if a pure <br>

              function wishes to call another function, it must also be

              determined to <br>

              be pure... see pure1b.pp for the recursive example where

              the actual <br>

              parameter isn't even a constant, but is nonetheless

              deterministic).<br>

            </blockquote>

          </div>

        </div>

        <div dir="auto"><br>

        </div>

        <div dir="auto">Wouldn't it make more sense to ensure that the

          Str() and Val() intrinsic work correctly inside "pure"

          functions? After all the compiler can then simply evaluate the

          inlinen nodes and does not have to "interpret" a ton of other

          code as well.</div>

        <div dir="auto"><br>

        </div>

        <div dir="auto">Regards, </div>

        <div dir="auto">Sven </div>

      </div>

      <br>

      <fieldset class="moz-mime-attachment-header"></fieldset>

      <pre class="moz-quote-pre" wrap="">_______________________________________________

fpc-devel maillist  -  <a class="moz-txt-link-abbreviated" href="mailto:fpc-devel@lists.freepascal.org">fpc-devel@lists.freepascal.org</a>

<a class="moz-txt-link-freetext" href="https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel">https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel</a>

</pre>

    </blockquote>

  </body>

</html>