<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>Il 23/08/2018 11:34, Marco Borsari via fpc-pascal ha scritto:<br>
</p>
<blockquote type="cite"
cite="mid:20180823113450.4c56c33ffc14840b88ee5f1b@libero.it">
<pre wrap="">It would be for the Wirth optimization in the access of an array,
when the index is 0 or 1, allowing the elimination of a multiplication.</pre>
</blockquote>
<br>
I'm afraid that this sort of optimization is rather outdated, and
doesn't take into account a number of factors which make modern
processors behave quite differently from old times ones.<br>
<br>
First of all, saving a multiplication was important at the times
when even integer multiplication was very costly in terms of machine
cycles. Today one must carefully check if there's really any saving
in replacing a multiplication (which is nowadays very fast) in just
a couple of cases (index 0 and 1) with a number of checks and jumps
for *all* cases.<br>
Moreover, any compiler will align your array elements on 32bits
boundaries (or 64bits if you're in a 64bits platform) for better
efficiency, so that your multiplication will be a multiplication by
a power of two, which all the modern compilers translate into a much
faster shl. No doubt that adding checks and jumps to save a shl
doesn't make much sense.<br>
<br>
The second important factor to take into account is that modern
processors do not execute instructions taken directly from the
slower main memory, but they fetch them from the much faster cache.
A jump instruction, if the target isn't very near, will generate a
cache flush, thus dramatically increasing the execution time. That's
why modern compilers try to avoid a branch table whenever possible.
In the example you mentioned earlier (<a moz-do-not-send="true"
href="http://www.mindfruit.co.uk/2012/01/switch-blocks-jump-tables.html">http://www.mindfruit.co.uk/2012/01/switch-blocks-jump-tables.html</a>)
you can read that gcc translates a switch (= fpc case) construct
into a branch table with -O1 (minimal optimization), but with higher
optimization (-O3) it avoids the branch table and translates the
construct into a sequel of conditional jumps.<br>
In any case for the above reason (avoid cache flushing with a jump
target too far away) the code that I had proposed (branch table in
the data area) is preferable to the one of your last attempt (branch
table in the code area).<br>
<br>
You may easily verify what solution provides the best efficiency in
your platform, by performing a high number of iterations over your
code, with random idx's in the range of your array size, and getting
the total execution time (sort of: <b>time</b> your_prog). Keeping
in mind that on a different platform (different cache size) it could
behave quite differently.<br>
<br>
Giuliano <br>
<pre class="moz-signature" cols="72">--
Do not do to others as you would have them do to you.They might have different tastes.</pre>
</body>
</html>