[fpc-devel] Re: Comparison FPC 2.6.2 - Kylix 3
Daniël Mantione
daniel.mantione at freepascal.org
Mon Mar 4 13:38:50 CET 2013
On Mon, 4 Mar 2013, Martin Schreiber wrote:
> On Monday 04 March 2013 12:05:37 Florian Klämpfl wrote:
>> On 04.03.2013 01:00, Graeme Geldenhuys wrote:
>>> 4.4 seconds (Kylix under Linux) vs 89 seconds (FPC under Linux)... That
>>> is just too huge a performance difference to justify. Yes, we all know
>>> the argument about more platforms, maintainable code etc, but that
>>> couldn't possibly be the only reason for such a huge speed difference.
>>> Somewhere there is a serious bottleneck (or several), or the FPC team
>>> simply disregards optimization completely. From what I have heard them
>>> say, the latter is more likely [unfortunately].
>>
>> You completely miss the point. If there are only approx. 25
>> features/properties which each make the compiler 10% slower, then in
>> total FPC is about 10 times slower than before (1.1^25 ≈ 10.8).
>
> Is this correct? It implies that every feature/property affects 100% of
> the total compilation process. And if it is true, it is absolutely
> necessary to stop adding features soon, because 1.1^50 ≈ 117.4. ;-)
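The compounding arithmetic itself is sound, as a quick check shows (a
throwaway FPC snippet, not part of the compiler):

    program Compound;
    {$mode objfpc}
    uses
      Math;
    begin
      { 25 features, each costing 10% of total compile time, multiply: }
      WriteLn('1.1^25 = ', Power(1.1, 25):0:1);  { 10.8 }
      WriteLn('1.1^50 = ', Power(1.1, 50):0:1);  { 117.4 }
    end.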
Some features only require processing power if you use them. However,
the features in Florian's list require continuous processing power,
whether or not your code uses them. Two examples of how features can
impact overall speed:
1. Operator overloading
Operators are some of the most common tokens in source code. Without
operator overloading, if you parse an operator, you simply generate a tree
node.
With operator overloading, for each operator that you parse, you have to
traverse all loaded units to check if the operator is overloaded. If there
are 50 units loaded, this means 50 symtable lookups, simply because the
operator might be overloaded.
For each operator overload candidate that is found, the compiler then
has to check many possible type conversions to see whether the
candidate can actually be used.
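For illustration, a minimal overloaded operator (a hypothetical
example, not taken from this thread). Once a declaration like this
exists in any loaded unit, every '+' the parser encounters has to be
checked against it:

    program OpDemo;
    {$mode objfpc}

    type
      TVec = record
        X, Y: Double;
      end;

    operator + (const A, B: TVec) R: TVec;
    begin
      R.X := A.X + B.X;
      R.Y := A.Y + B.Y;
    end;

    var
      P, Q, S: TVec;
    begin
      P.X := 1; P.Y := 2;
      Q.X := 3; Q.Y := 4;
      S := P + Q;  { resolved by searching the symtables of all loaded units }
      WriteLn(S.X:0:1, ' ', S.Y:0:1);  { 4.0 6.0 }
    end.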
The situation with Pascal type conversion has grown increasingly
complex over the years. For example, almost any type can be converted
into a variant, and a variant can be converted into almost any type.
This requires all kinds of special handling, not only to do the right
thing, but also to avoid inefficient type conversions.
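A small illustration of why variants widen the search space (again a
hypothetical example, assuming only the standard Variants unit):

    program VariantDemo;
    {$mode objfpc}{$H+}
    uses
      Variants;  { enables variant support }
    var
      V: Variant;
      I: Integer;
      S: string;
    begin
      V := 42;     { almost any type converts into a variant... }
      S := V;      { ...and a variant converts into almost any type }
      I := V + 1;  { conversions the compiler must consider at every operator }
      WriteLn(S, ' ', I);  { 42 43 }
    end.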
So even if you don't use operator overloading or variants at all, they do
affect the compiler speed.
2. Layered code generation
The split of the code generation into a high-level and a low-level
layer means that for every node that is processed, first the
high-level virtual method is called, which in turn calls the low-level
virtual method. Thus you have an additional virtual method call for
every node processed.
The low-level code generator, which is still mostly CPU independent,
again calls virtual methods of the abstract assembler layer to
generate the actual opcodes. The abstract assembler, in turn, has to
worry about the multiple assemblers that can emit the final object
file.
Now each layer not only has its own code, but also its own types, so
conversion functions need to be called: for example, a def has a size,
which is converted into a cgsize and ultimately into an opsize.
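A minimal sketch of this layering (hypothetical class and method
names; the real compiler classes are far more elaborate). Generating
one node costs a chain of virtual calls plus a size conversion at
each boundary:

    program LayerDemo;
    {$mode objfpc}

    type
      { bottom layer: abstract assembler, specialised per object writer }
      TAssembler = class
        procedure Emit(const Op: string); virtual;
      end;

      { middle layer: mostly CPU-independent low-level code generator }
      TCodeGen = class
        FAsm: TAssembler;
        procedure A_Add(CGSize: Integer); virtual;
      end;

      { top layer: high-level, type-aware code generator }
      THLCodeGen = class
        FCG: TCodeGen;
        procedure EmitAdd(DefSize: Integer); virtual;
      end;

    procedure TAssembler.Emit(const Op: string);
    begin
      WriteLn(Op);  { a real back-end picks an opsize and writes object code }
    end;

    procedure TCodeGen.A_Add(CGSize: Integer);
    begin
      FAsm.Emit('add');  { second virtual call; cgsize -> opsize here }
    end;

    procedure THLCodeGen.EmitAdd(DefSize: Integer);
    begin
      FCG.A_Add(DefSize);  { first virtual call; def size -> cgsize here }
    end;

    var
      HL: THLCodeGen;
    begin
      HL := THLCodeGen.Create;
      HL.FCG := TCodeGen.Create;
      HL.FCG.FAsm := TAssembler.Create;
      HL.EmitAdd(4);  { one node, three layers of dispatch }
    end.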
Obviously, if you had just one layer and could output instructions
directly to the object file, you could save a lot of time.
While you might develop for just one platform, the fact that many
platforms are supported costs compiler performance.
Daniël