[fpc-devel] Parallel processing in the compiler
DrDiettrich1 at aol.com
Mon Sep 6 05:22:32 CEST 2010
Florian Klämpfl wrote:
>> Right, that's how it *should* be designed. But try to find out why the
>> code generation is added, when variables like current_filepos or
>> current_tokenpos are moved into TModule (current_module) :-(
> Why should current_filepos and current_tokenpos go into TModule? They
> can be perfectly threadvars.
Good point. So would it be sufficient to redeclare all such variables as
threadvar in order to make parallel processing (threads) possible?
I have no real idea how, in such a model, the initialization of the
threadvars would have to be implemented. That's why I try to assign all
state-related variables to a definite object, whose reference can
easily be copied (or moved?) into any created thread.
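For what it's worth, a minimal sketch of the two approaches under discussion (all names hypothetical, not actual compiler declarations):

```pascal
{ Hypothetical sketch -- not actual compiler code. }
type
  tfileposinfo = record
    line, column: longint;
  end;

{ Approach 1: threadvar -- every thread automatically gets its own
  copy, but each thread must still initialize that copy on startup. }
threadvar
  current_filepos: tfileposinfo;
  current_tokenpos: tfileposinfo;

{ Approach 2: collect all state in one object and hand its reference
  to each created thread. }
type
  TCompileState = class
  public
    filepos: tfileposinfo;
    tokenpos: tfileposinfo;
    { ... further per-compilation state ... }
  end;
```

With threadvars no reference needs to be passed around, but initialization must be repeated in every thread; with a state object, initialization happens once and the object is handed over explicitly.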
Then we also have to draw a border between what can run in parallel and
what still requires sequential processing. Sequential processing must be
used for all output, be it binary or log files. Also, encountered errors
must be reported from the originating thread to all related (main and
other) threads when they require compilation to stop and shut down in an
orderly way.
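A minimal sketch of how such sequential output could be enforced, using the RTL's SyncObjs unit (the WriteLog name is invented):

```pascal
{ Hypothetical sketch: serialize all log output through one lock so
  that concurrently running compiler threads never interleave writes. }
uses
  SyncObjs;

var
  OutputLock: TCriticalSection;

procedure WriteLog(const Msg: string);
begin
  OutputLock.Acquire;
  try
    WriteLn(Msg);   { likewise for writes to binary output files }
  finally
    OutputLock.Release;
  end;
end;

initialization
  OutputLock := TCriticalSection.Create;
finalization
  OutputLock.Free;
```

Error reporting to the main thread could go through a similar protected channel instead of direct writes.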
> Further, I don't see them fitting into
> TModule, they describe not the state of a module but are part of the
> compilation state. Even more, consider several threads compiling
> different procedures of one module: putting current_filepos and
> current_tokenpos into TModule won't work in this case.
Right, but I see no chance for such parallelism before all related
variables have been found. See my questions about just these variables
and the corresponding tfileposinfo values in several objects.
Parallel code generation requires that the cg is separated from parsing,
so that the next procedure can be parsed while the previously parsed
procedures are compiled.
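That separation amounts to a producer/consumer pipeline; a rough sketch (all names invented) using the RTL's TThreadList:

```pascal
{ Hypothetical sketch: the parser thread pushes finished procedure
  trees into a thread-safe queue; a code generator thread drains it. }
uses
  Classes;

var
  PendingProcs: TThreadList;   { parsed but not yet compiled }

procedure ParserLoop;
begin
  while not EndOfModule do               { assumed parser entry point }
    PendingProcs.Add(ParseNextProcedure);
end;

procedure CodegenLoop;
var
  L: TList;
  Tree: TObject;
begin
  while True do
  begin
    Tree := nil;
    L := PendingProcs.LockList;
    try
      if L.Count > 0 then
      begin
        Tree := TObject(L[0]);
        L.Delete(0);
      end;
    finally
      PendingProcs.UnlockList;
    end;
    if Tree <> nil then
      GenerateCode(Tree);                { assumed cg entry point }
  end;
end;
```

A real implementation would block on an event instead of spinning, and would need a termination signal, but the structure is the point: parsing and code generation only share the queue.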
>> The last change was to remove ppudump from the Makefile, and this proved
>> that so far only ppudump is sensitive to changes in the compiler
> Guess why I'm sceptical that it's useful to use the compiler parser
> for other purposes like code tools or documentation: probably once per
> week a simple compiler change breaks this external usage (we had this
> experience ten years ago ;) ).
I've postponed that initial motivation to the end of all other
refactoring. Apart from parallelism, I see more chances for the
introduction of really new features in other places, like multiple
front-ends. Such projects require a separation of the mere parser from
the rest of the infrastructure, i.e. the handling of all symbols, the
creation of nodes, etc. have to be moved into new, commonly usable
interfaces. After that step it would also be easy, and would break
nothing in the compiler, to add a no-cpu target to the target list.
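As an illustration only (these interfaces do not exist in the compiler), the separation could look something like:

```pascal
{ Hypothetical sketch: hide symbol handling and node creation behind
  interfaces, so several front-ends (or a no-cpu target) can share
  the parser without touching the rest of the infrastructure. }
type
  ISymbolHandler = interface
    procedure AddSymbol(const AName: string; ASym: TObject);
    function FindSymbol(const AName: string): TObject;
  end;

  INodeFactory = interface
    function CreateBinaryNode(AKind: integer;
      ALeft, ARight: TObject): TObject;
    function CreateLoadNode(ASym: TObject): TObject;
  end;
```

The parser would then depend only on these interfaces, and each back-end (or a dummy no-cpu one) would supply its own implementation.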
> How can we continue? I'll see if I find time within the next week (I
> was on holiday for one week) to review the noglobals changes and how we
> can split them into usable parts.
IMO the most important decision is about the general direction of the
refactoring. Do we want more OO (encapsulation), more codegen
separation, or something else? IMO encapsulation is the most useful
first step towards any other goal. The current compiler "structure" is
dictated by purely *formal* aspects (unit dependencies) and does not
reflect the *logical* dependencies between objects, variables,
procedures etc. This lack of logical structure, together with the lack
of up-to-date documentation, is the most annoying problem with *every*
compiler.