[fpc-pascal]Execution speed

Lee, John John.Lee at logicacmg.com
Tue Jan 28 20:12:35 CET 2003

So now you've had Marco's ideas + my 0.1 euro's worth (there are some
differences in our approaches + some common ideas) re what the problem is &
how you should test what's happening, please let us know what the results of
your experiments are...J 
Now that the list seems to be working again... I'll try posting this again:
> I am wondering why our new PC is not executing our fpc-compiled program
> very much faster than the old one. It was really quite a disappointment:
> Old PC: Laptop, Intel PII, 300 MHz, 64 MB. Execution times: 8:30, 2:30
> New PC: Desktop, AMD Duron, 1.6 GHz, 128 MB. Execution times: 5:15, 1:15

I'm not the processor expert of the FPC team, but I'll give it a shot.
(Jonas and Florian will probably correct/comment on this heavily :-)

I'm afraid you have fallen for MHz-itis, i.e. the belief that the
throughput of a processor depends purely on the clock speed of the CPU
(in MHz).

Some important things I noticed immediately from your msg:

- There is still a nearly twofold increase (a bit less for the first
  time, exactly twofold for the second).
- You use 4 MB of memory, and I assume from your description that the
  access pattern is fairly random.
- The Duron has less cache than an Athlon, and the Duron's cache is
  probably of about the same magnitude as the P-II's.
- The 4 MB doesn't fit in the cache -> the processor is waiting for
  memory all the time.

> The new PC ought to be 5 times faster (1600 MHz / 300 MHz, right? 

Depends on the job. The memory interface is probably only about twice
as fast (66 MHz <-> 133 MHz), and the cache (which can in some cases
"hide" the slower memory) is also hardly any larger.
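
To put numbers on this, here is a minimal sketch that only redoes the
arithmetic with the figures quoted in this thread (run times in
minutes:seconds, 300 vs 1600 MHz, and my guess of 66 vs 133 MHz for the
memory bus). The names ToSeconds and Speedup are made up for the
illustration. It should show the measured speedups (about 1.6 and 2.0)
sitting much closer to the memory bus ratio (about 2) than to the clock
ratio (about 5.3).

```pascal
program SpeedupCheck;
{$mode objfpc}

{ Convert a "minutes:seconds" run time into seconds. }
function ToSeconds(Minutes, Seconds: Integer): Integer;
begin
  Result := Minutes * 60 + Seconds;
end;

{ How many times faster the new run is compared with the old one. }
function Speedup(OldSecs, NewSecs: Integer): Double;
begin
  Result := OldSecs / NewSecs;
end;

begin
  { Ratios of the raw hardware figures; the bus speeds are an assumption. }
  WriteLn('Clock ratio      : ', (1600 / 300):0:2);
  WriteLn('Memory bus ratio : ', (133 / 66):0:2);
  { Measured run times from the original post: old PC vs new PC. }
  WriteLn('Process 1 speedup: ',
    Speedup(ToSeconds(8, 30), ToSeconds(5, 15)):0:2);
  WriteLn('Process 2 speedup: ',
    Speedup(ToSeconds(2, 30), ToSeconds(1, 15)):0:2);
end.
```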

> Of course the speed of the memory is also a factor) but it's not even twice
> as fast.

Which is exactly why I think it is memory bound (together with the
problem not being OS dependent; I assume you tried some *nix).

I went from a K6-2 500 to an Athlon 1666 (XP2000+), which is a step of
a bit more than a factor of 3 in clock speed, but there the compiler
compiles itself more than 3 times as fast.

> The execution time pairs are determined from three time stamps that
> occur during one run of the program. The sequence is as follows:
> * Stamp 1
> -Initialize (5-10 secs reading/processing from HD)
> -Process 1 (5-9 mins)
> * Stamp 2
> -Process 2 (1-3 mins)
> * Stamp 3

Since the second process scales better, I assume it accesses
memory in a way that the cache can handle better.
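
For comparing machines it helps to take the stamps the same way every
time. Below is a minimal sketch of the three-stamp scheme, assuming a
reasonably recent FPC where SysUtils provides GetTickCount64. The
procedures InitializeData, RunProcess1 and RunProcess2 are empty
placeholders for the real program, and FormatElapsed is a made-up
helper.

```pascal
program StampDemo;
{$mode objfpc}
uses
  SysUtils;

{ Format a duration in milliseconds as minutes:seconds. }
function FormatElapsed(Millis: Int64): string;
var
  TotalSecs: Int64;
begin
  TotalSecs := Millis div 1000;
  Result := Format('%d:%.2d', [TotalSecs div 60, TotalSecs mod 60]);
end;

{ Hypothetical placeholders: the real work would go here. }
procedure InitializeData;
begin
end;

procedure RunProcess1;
begin
end;

procedure RunProcess2;
begin
end;

var
  Stamp1, Stamp2, Stamp3: QWord;
begin
  Stamp1 := GetTickCount64;
  InitializeData;
  RunProcess1;
  Stamp2 := GetTickCount64;
  RunProcess2;
  Stamp3 := GetTickCount64;
  WriteLn('Initialize + Process 1: ', FormatElapsed(Int64(Stamp2 - Stamp1)));
  WriteLn('Process 2             : ', FormatElapsed(Int64(Stamp3 - Stamp2)));
end.
```

Putting an extra stamp between the initialization and Process 1 would
also show directly how much of the first time is disk reading.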

> Both machines are running Win98 Second Edition (could Windows 98 be
> preventing the faster machine from running at full capacity?

Not for pure calculation, I think. For heavily I/O-bound or
multithreaded programs 98 may make a huge difference, but if there is a
difference in calculation speed under 98, it won't be more than a few
percent (and since NT and Unix have more to do in the background,
the difference could even be in 98's favour).

> Or perhaps it's because fpc runs in a DOS window, and the DOS mode is
> forcing it to run slow?)
> The program is very processor intensive. Only about 4MB of memory space
> is used.

You could try changing the memory layout so that consecutive
memory accesses are adjacent in memory, and experiment
with alignment.
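
To illustrate what adjacent accesses mean, here is a minimal sketch, not
taken from the program in question. It sums the same 4 MB array twice:
once with the inner loop running along the last index (neighbouring
addresses) and once with the inner loop running along the first index
(jumps of N * 4 bytes). The array size, the constant Passes and all
names are assumptions for the illustration, and GetTickCount64 assumes a
reasonably recent FPC. How big the difference is depends on the machine
and the optimization settings.

```pascal
program LocalityDemo;
{$mode objfpc}
uses
  SysUtils;

const
  N = 1024;      { 1024 x 1024 LongInt values = 4 MB }
  Passes = 100;  { repeat so the run is long enough to time }

type
  TGrid = array[0..N - 1, 0..N - 1] of LongInt;

var
  Grid: TGrid;   { global, so it does not live on the stack }

procedure FillGrid;
var
  Row, Col: Integer;
begin
  for Row := 0 to N - 1 do
    for Col := 0 to N - 1 do
      Grid[Row, Col] := (Row + Col) and 255;
end;

{ Inner loop walks along a row: consecutive accesses are adjacent. }
function SumRowByRow: Int64;
var
  Row, Col: Integer;
begin
  Result := 0;
  for Row := 0 to N - 1 do
    for Col := 0 to N - 1 do
      Result := Result + Grid[Row, Col];
end;

{ Inner loop walks down a column: consecutive accesses are N * 4 bytes apart. }
function SumColumnByColumn: Int64;
var
  Row, Col: Integer;
begin
  Result := 0;
  for Col := 0 to N - 1 do
    for Row := 0 to N - 1 do
      Result := Result + Grid[Row, Col];
end;

var
  Start: QWord;
  Total: Int64;
  Pass: Integer;
begin
  FillGrid;

  Total := 0;
  Start := GetTickCount64;
  for Pass := 1 to Passes do
    Total := Total + SumRowByRow;
  WriteLn('Row by row      : sum ', Total, ', ', GetTickCount64 - Start, ' ms');

  Total := 0;
  Start := GetTickCount64;
  for Pass := 1 to Passes do
    Total := Total + SumColumnByColumn;
  WriteLn('Column by column: sum ', Total, ', ', GetTickCount64 - Start, ' ms');
end.
```

Both loops do the same amount of arithmetic and produce the same sum, so
any difference in the timings comes from the order of the memory
accesses.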

You could also try to find/borrow a processor with a large cache 
(e.g. a P-III Xeon with 2 MB cache would be ideal, but an Athlon MP 
or even a simple Athlon would be interesting), and do the test on 
such a machine.

> During runtime, we are doing less than 400 kb of read/write combined to
> the HD. We put about 10 lines of text on the DOS screen to show
> progress. So I can't imagine the I/O could be slowing us down.

Not likely, no.

> I tried compiling with the two different target platforms, but it didn't
> make a difference. Stackchecking is on, but it was on on both computers.

Did you use the same level of optimization on both? Maybe you
have -OG3p3r or so in the ppc386.cfg on the P-II (which
automatically adds the heaviest optimizations), and not on the Duron.

> I also tried a few different bios settings (the computer has ready-made
> bios configurations for "Optimal" and "Best Performance" (?) as well as
> the factory default I started with.) But the compile times were the same
> regardless of the bios settings.

Usually this makes a difference of a few percent at most, not orders of
magnitude.

Action list: (in order that I would do them, from first to last resort)
1 verify that you use the same degree of optimizations. 
2 Try on a machine with more cache.
3 Try to rewrite programs to do more accesses to the same block of 