[fpc-devel] Parallel processing in the compiler

Mon Sep 6 03:48:36 CEST 2010

Jonas Maebe wrote:

> Nobody ever said that it would be easy to restructure the compiler. In 
> fact, I will be the first to admit that it is incredibly hard, and since 
> the compiler is 15 years old (although several parts have been rewritten 
> over the years), there are bound to be quite a few less than ideal 
> situations and things that simply grew over time. However, I very much 
> dislike the way you are doing the development and find it very hard to 
> see how we are ever going to be able to integrate any of those changes 
> back into trunk.

Can you give me some hints on how to restructure the compiler in the 
most compatible way, based on your experience?

As you can see from my many attempts (public and private branches) to 
attack the problem from different sides, I'm open to any strategy :-)

> Maybe I simply should stop being involved in this, since I'm obviously 
> not able to productively deal with this project.

Please stay involved! Since I'm very new to the fpc project, I need 
assistance from more experienced developers.

>>> You have to describe in your svn commit logs what you actually do.
>>> "NG: fixed make all" is not a useful commit message. You changed 149
>>> files in that commit, and did not describe what was done nor why:
>>
>> I only committed these changes so that they don't get lost.
> 
> You should never commit unrelated changes in a *single* commit. Even if 
> you don't want to create a separate branch for other changes, at the 
> very least use separate commits (although unrelated changes will make 
> merging more difficult in case later changes also depend on them).

ACK. That's why I asked for timely reviews of my work, so that the trunk 
can be updated in sync with my progress, and bad attempts can be 
detected and avoided in an early stage.

>> Please review the preceding version, as mentioned in the Readme.
> 
> The main problem is that it cannot be merged since it doesn't compile.

I only committed changes that compile on my machine (win32/64). The 
problem with ppudump - and only there - was brought to my attention 
only recently, and its solution is not easy for me.

> As I mentioned earlier: the ideal situation are piecemeal changes that 
> form a tested and self-contained whole that can be easily reviewed and 
> merged. While a single change related to global variables can obviously 
> have repercussions all over the compiler, that does not mean that you 
> have to replace all global variables that at first sight should end up 
> in tscanner with fields in one go, especially if you are unable to 
> completely test those changes (make all, make fullcycle, running the 
> testsuite).

I actually have no idea how to avoid the observed problems, given 
the general decision that module-specific variables have to become 
fields or properties of TModule. This is a situation where I badly need 
assistance from the core developers...
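
To make concrete what "a global variable becomes a field" means, here 
is a minimal sketch. It is not taken from the compiler sources: 
TModuleSketch and CurrentLine are hypothetical names chosen only for 
illustration. The point is that each instance carries its own copy of 
the former global, which is what parallel processing would need.

```pascal
program GlobalToField;
{$mode objfpc}{$H+}

type
  { Hypothetical stand-in for a per-module class; not the real TModule. }
  TModuleSketch = class
  public
    { Formerly a unit-level global variable; now one copy per instance. }
    CurrentLine: Integer;
    procedure NextLine;
  end;

procedure TModuleSketch.NextLine;
begin
  Inc(CurrentLine);
end;

var
  A, B: TModuleSketch;
begin
  A := TModuleSketch.Create;
  B := TModuleSketch.Create;
  try
    A.NextLine;
    A.NextLine;
    B.NextLine;
    { Each module keeps its own counter: prints "2 1". }
    WriteLn(A.CurrentLine, ' ', B.CurrentLine);
  finally
    A.Free;
    B.Free;
  end;
end.
```

With a single global variable instead of the field, both objects 
would share one counter, and the same code would count to 3.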

All the suggested tests stopped working many weeks ago, so that I could 
proceed with the compiler project only. Now that I have found out that 
dropping ppudump from "make all" allows the tests to run, I can 
at least fix the real syntax errors in the code, which occur only in the 
generation of compilers for other targets.

WRT the testsuite, the results are of little help (to me) as long as 
other users acknowledge that the test results don't match the given 
reference. Can somebody explain that known problem?

A really broken parser should show up during make cycle, which can be 
ruled out. So currently I can only compare the test suite results against 
those of the trunk version, not knowing what will happen on other host 
systems :-(

>> The last change was to remove ppudump from the Makefile, and this 
>> proved that so far only ppudump is sensitive to changes in the 
>> compiler internals.
> 
> It also proves that the internal compiler structure has been broken.

Can you tell me what exactly broke the structure?

Apart from that, I dare to doubt the existence of a strong 
"structure" inside the compiler. IMO the structure is so fragile that 
every reasonable refactoring attempt will hit a dead end very soon. But 
I'd be happy if somebody could prove me wrong :-)

>>> * you
>>> deleted the aasmsym unit and changed code that depended on it without
>>> any remarks why you did that. The comment in the header of that unit
>>> said: "Contains abstract assembler instructions for all processor 
>>> types, including routines which depend on the symbol table. These
>>> cannot be in aasmtai, because the symbol table units depend on that
>>> one."
>>
>> I wanted to make code generation more abstract, so that it can be 
>> separated better from the remaining code.
> 
> Removing an extra abstraction class and folding it into the base class 
> is the opposite of making something more abstract.

Perhaps the description of aasmsym misled me. I'll restore 
aasmsym and test again...

>> The GlobVar unit is intended as a place for all (previously) global 
>> variables, that can be used without importing further dependencies.
> 
> The aasmsym unit did not contain any global variables. And putting all 
> global variables in a single unit may not be the right approach since 
> the compiler consists of many components. They are obviously interwoven, 
> but making them more interwoven by basically making all types of all 
> global variables globally visible does not sound like a good idea.

My idea was (and still is) to remove direct references to many units as 
far as possible, so that the interface parts of all units can be 
stripped accordingly, reducing the chance for cyclic unit references and 
increasing the chance for refactoring in other places. The only negative 
effect may be the inclusion of more units in related tools, e.g. 
ppudump. This problem may be solved in a later stage, when the new 
dependencies have settled down.

>>> You nevertheless did move that code to aasmtai and now there are
>>> indeed (implementation-level) circular dependencies between aasmtai
>>> and symsym. Adding extra circular dependencies to the compiler
>>> without any argument as to why this is required is not good.
>>
>> I don't remember details.
> 
> That's why you should write it in the svn log. Note that you can edit 
> the fpc repository's svn log entries even after you have committed, 
> using "svn pe -r <revnr> --revprop svn:log".

Thanks for this hint, but that doesn't help when I don't remember 
details any more :-(

>> My experimental approach left these units unmodified, and still 
>> resulted in importing code generation (see above).
> 
> That's because of other changes. Making the problem worse does not 
> contribute to the solution.

Right, those "other" changes turned out to break something in the 
existing structure. If we agree to make the mentioned approach the 
first candidate for inclusion into trunk, and figure out how the 
remaining required (currently breaking) changes can be made, it would 
make life easier for all of us. If no solutions for such early-stage 
refactoring can be found, all attempts towards parallel processing can 
be stopped immediately :-(

>>> * that
>>> commit also contains changes that have nothing to do with fixing
>>> "make all", such as fixing a typo in a comment, removing type
>>> redefinitions, commenting the assignment of tcgaddnode to caddnode in
>>> the init code of ncgadd.pas (and similar assignments in ncgmat),
>>> possibly an unrelated change in ppheap (which is wrongly indented),
>>> removing an unused local type definition from ppu.pas...
>>
>> I tried to follow the coding style, to not comment anything in the code.
> 
> Oh, please.

;-)

Actually I should start yet another (private) branch, where I can 
update and reformat the trunk version according to my needs. The coding 
style(s) found in the trunk make it hard to figure out what's going 
on. Obstacles are deep indentation levels and enlarged code sections, 
caused by arbitrary excessive indentation and the requirement to place 
every "begin" on a new line. My personal style results in more compact 
code that can be analyzed more easily. No offense intended; coding style 
is in fact a very personal thing, and a public project deserves *commonly* 
accepted rules. I only wanted to explain why I updated e.g. comments 
while working on the code, so that it's nearly impossible to commit such 
changes in distinct steps.

>> Please suggest a better and more systematic way of reducing the 
>> number of compiler hints, and supply a justification for redefinitions 
>> of system unit types.
> 
> I'm not saying that the local type definitions should stay. I'm saying 
> that removing them must not be committed together with completely 
> unrelated changes.

Okay, I'll try to commit more often in the future. My intent was to keep 
the repository small, by not committing every single change. If this 
is counterproductive WRT later merges, I'll have to change my habits. 
For the same reason I didn't create new experimental branches, after it 
turned out that most of them end up in dead ends, and nobody ever looked 
at them.

Maybe git will help to reduce such problems, but the learning curve... :-(

> There are more people than you alone, and many people have spent a lot 
> of time on answering many questions from you. Reviewing a bunch of 
> changes with little or no information about why which changes were done is
> a) a lot of work
> b) not very enticing

Thanks for spending your time on me :-)

DoDi