<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Il 23/08/2018 11:34, Marco Borsari via fpc-pascal ha scritto:<br>

    </p>

    <blockquote type="cite"

      cite="mid:20180823113450.4c56c33ffc14840b88ee5f1b@libero.it">

      <pre wrap="">It would be for the Wirth optimization in the access of an array,

when the index is 0 or 1, allowing the elimination of a multiplication.</pre>

    </blockquote>

    <br>

    I'm afraid that this sort of optimization is rather outdated, and

    doesn't take into account a number of factors which make modern

    processors behave quite differently from old times ones.<br>

    <br>

    First of all, saving a multiplication was important at the times

    when even integer multiplication was very costly in terms of machine

    cycles. Today one must carefully check if there's really any saving

    in replacing a multiplication (which is nowadays very fast) in just

    a couple of cases (index 0 and 1) with a number of checks and jumps

    for *all* cases.<br>

    Moreover, any compiler will align your array elements on 32bits

    boundaries (or 64bits if you're in a 64bits platform) for better

    efficiency, so that your multiplication will be a multiplication by

    a power of two, which all the modern compilers translate into a much

    faster shl. No doubt that adding checks and jumps to save a shl

    doesn't make much sense.<br>

    <br>

    The second important factor to take into account is that modern

    processors do not execute instructions taken directly from the

    slower main memory, but they fetch them from the much faster cache.

    A jump instruction, if the target isn't very near, will generate a

    cache flush, thus dramatically increasing the execution time. That's

    why modern compilers try to avoid a branch table whenever possible.

    In the example you mentioned earlier (<a moz-do-not-send="true"

      href="http://www.mindfruit.co.uk/2012/01/switch-blocks-jump-tables.html">http://www.mindfruit.co.uk/2012/01/switch-blocks-jump-tables.html</a>)

    you can read that gcc translates a switch (= fpc case) construct

    into a branch table with -O1 (minimal optimization), but with higher

    optimization (-O3) it avoids the branch table and translates the

    construct into a sequel of conditional jumps.<br>

    In any case for the above reason (avoid cache flushing with a jump

    target too far away) the code that I had proposed (branch table in

    the data area) is preferable to the one of your last attempt (branch

    table in the code area).<br>

    <br>

    You may easily verify what solution provides the best efficiency in

    your platform, by performing a high number of iterations over your

    code, with random idx's in the range of your array size, and getting

    the total execution time (sort of: <b>time</b> your_prog). Keeping

    in mind that on a different platform (different cache size) it could

    behave quite differently.<br>

    <br>

    Giuliano <br>

    <pre class="moz-signature" cols="72">-- 

Do not do to others as you would have them do to you.They might have different tastes.</pre>

  </body>

</html>