[fpc-devel] Re: Comparison FPC 2.6.2 - Kylix 3

Florian Klämpfl florian at freepascal.org
Mon Mar 4 12:05:37 CET 2013


On 04.03.2013 01:00, Graeme Geldenhuys wrote:
> 
> 4.4 seconds (Kylix under Linux) vs 89 seconds (FPC under Linux)... That
> is just too huge a performance difference to justify. Yes, we all know
> the argument about more platforms, maintainable code etc, but that
> couldn't possibly be the only reason for such a huge speed difference.
> Somewhere there is a serious bottleneck(s), or the FPC team simply
> disregards optimization completely. From what I have heard them say, the
> latter is more likely [unfortunately].

You completely miss the point. If there are only approx. 25
features/properties which each make the compiler 10% slower, then in
total FPC is about 10 times (1.1^25 = 10.8) slower than before. Let me
name some of the stuff which might cost only 10% each (some more, some
less) but in total makes FPC much slower:

- gcc compatible output format (*.ppu+*.o instead of one big ppu only
usable with FPC)
- flexible assembler output (different assemblers supported, assembler
listing output)
- flexible object file output (coff, elf, ...)
- portable ppus (regardless of the host system, ppus are bitwise equal.
this means e.g. a lot of endian conversion checks and calls)
- class helpers
- operator overloading
- generics
- architecture agnostic node tree
- architecture agnostic constant handling (96 bit arithmetics)
- portable code generator
- support of bit packed data structures
- flexible debugging info output
- completely written in a high level language
- architecture agnostic symtable handling
- using the rtl heap manager, a non threadsafe heap manager would be
probably slightly faster but do we want to maintain two full featured
heap managers?
- using ansistrings, one could switch to zero passed char arrays but do
we really want this?
- architecture agnostic handling of procedure parameter passing
- code page aware reading of input files
- different FPU types supported
- portable optimizer
- 32/64 bit assembler supporting all available instruction sets
- high level code generator layer for jvm support
- support of different pascal dialects
- portable and flexible file handling
- using classes instead of objects
- ...

And yes, the speed penalty of these features/properties often multiplies
and is not only additive. Of course, this is all simplified but it
should give you an idea where the factor 10 comes from. It is a lot of
small things, none of them really counting on its own, but in total it's
a lot (exponential growth), and this makes it impossible to fix without
sacrificing a lot of FPC's power.
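
To make the arithmetic concrete, here is a small sketch in Python (not
FPC code; the function names are made up for illustration, and the 10%
and 25 are the simplified figures from above, not measurements). It
compares the case where the penalties multiply with the case where they
would only add up:

```python
def compounded_slowdown(per_feature_overhead: float, feature_count: int) -> float:
    """Slowdown factor if every feature multiplies compile time by (1 + overhead)."""
    return (1.0 + per_feature_overhead) ** feature_count


def additive_slowdown(per_feature_overhead: float, feature_count: int) -> float:
    """Slowdown factor if the overheads merely added up instead of multiplying."""
    return 1.0 + per_feature_overhead * feature_count


if __name__ == "__main__":
    # Assumed figures from the simplified argument: 10% per feature, up to 25 features.
    for count in (5, 10, 25):
        multiplied = compounded_slowdown(0.10, count)
        added = additive_slowdown(0.10, count)
        print(f"{count:2d} features: multiplied x{multiplied:.2f}, added x{added:.2f}")
```

With 25 features the multiplied factor is roughly 10.8, while the purely
additive one would only be 3.5. That gap is the point of the argument.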

FYI: FPC 1.x is several times faster than FPC 2.x
FYI2: Last time somebody tried (years ago, around the year 2000), FPC
compiled by Delphi was slower than FPC compiled by FPC.

My goal regarding the speed of FPC is: let Moore win. This means faster
CPUs should speed FPC up more than new features in FPC slow it down, and
this has worked for 20 years now. So nothing to worry about.

>> You are only showing the Delphi/Kylix speed is
>> extremely superior
> 
> And Martin is just showing half the problem. The Delphi & Kylix
> compilers also produce executables that run 10+ times faster than what
> FPC 2.6.0 can produce. Even on the more optimized 32-bit compiler. And
> don't even think of mentioning that faster hardware will mask the
> problem - it doesn't. I have an i7-2660K running at 3.6GHz with high
> performance RAM and 450MB read speed SSD. I noticed a 10+ times
> difference in running executables on my hardware.

Then something is wrong with your code, or you hit some strange
bottleneck. You should really profile your code.



