[fpc-devel] register allocator seems to be using S20 for two things at the same time (related to armhf porting work)

Daniël Mantione daniel.mantione at freepascal.org
Sun Mar 18 15:30:18 CET 2012



On Sun, 18 Mar 2012, peter green wrote:

> Daniël Mantione wrote:
>> Please use the command line option -sr to check the generated code before 
>> register allocation. 
> Done and attached.
>> You can likely find the cause in there.
> The code with imaginary registers looks correct to me. It seems to load each 
> parameter into a separate even-numbered imaginary register (using odd-
> numbered imaginary registers as temporaries in the process), allocating them 
> as it goes. It then copies them to the locations needed for the function 
> call, deallocating them as it goes.

I agree, both imaginary registers are correctly allocated and freed.

> So it seems to me that the problem is in the translation of the form using 
> imaginary registers to the form using real registers. Can you explain (or 
> point me to documentation on) how this form is translated into a form using 
> real registers?

The algorithm used is called "iterated register coalescing", an advanced 
form of graph colouring designed by Lal George and Andrew W. Appel. Appel 
describes it in detail in his book "Modern Compiler Implementation in C".

Basically the registers are put into "worklists", which 4 procedures process:
* simplify   -> takes registers out of the graph that can safely be coloured
* coalesce   -> tries to coalesce registers together to reduce pressure
* freeze     -> selects registers that have a chance to be coalesced and
                 should therefore not be spilled yet
* spill      -> selects a register that should move to memory.

These procedures are called iteratively until the worklists are empty. 
Then the graph is coloured.
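
As a rough illustration of the loop described above, here is a minimal 
Python sketch. It is not FPC's code: it leaves out coalesce and freeze 
entirely and only shows simplify, spill selection and the final colouring, 
i.e. plain graph colouring rather than full iterated register coalescing. 
The function name colour and the dict-of-sets graph are assumptions made 
for the example.

```python
def colour(graph, k):
    """Colour an interference graph with k real registers.

    graph maps each imaginary register to the set of registers it
    interferes with (edges must be listed in both directions).
    Returns (colours, spilled). Hypothetical helper, not FPC's API.
    """
    adj = {n: set(ns) for n, ns in graph.items()}  # working copy
    stack = []
    while adj:
        # simplify: a node with fewer than k neighbours is always colourable
        node = next((n for n in sorted(adj) if len(adj[n]) < k), None)
        if node is None:
            # spill: nothing is safe, so pick a high-degree spill candidate
            node = max(sorted(adj), key=lambda n: len(adj[n]))
        stack.append(node)
        for m in adj.pop(node):
            adj[m].discard(node)

    # colour the graph: pop nodes and give each the lowest free colour
    colours, spilled = {}, []
    while stack:
        node = stack.pop()
        used = {colours[m] for m in graph[node] if m in colours}
        free = [c for c in range(k) if c not in used]
        if free:
            colours[node] = free[0]
        else:
            spilled.append(node)  # must really go to memory
    return colours, spilled
```

With three registers that all interfere with each other and only two real 
registers available, one of them ends up in the spilled list.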

> In particular what exactly happens when there are not enough 
> real registers free to assign a real register for every imaginary register 
> that is in use at a given time?

Then a register is spilled, i.e. replaced by a location in memory. This may 
be possible without new registers:

mov ireg30d,ireg29d    ->     mov ireg30d,[ebp-40]

... but in some cases a helper register is needed:

mov ireg30d,[ebp+20]   ->     mov ireg99d,[ebp+20]
                               mov [ebp-40],ireg99d

Should a new register be needed, register allocation is 
completely restarted with the new code.
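
The two rewrites above can be sketched as follows. This is only an 
illustration in Python, assuming instructions are (opcode, dst, src) 
tuples and that an operand starting with "[" is a memory reference; the 
names spill, is_mem and fresh are made up for the example and are not 
taken from the compiler.

```python
def is_mem(operand):
    """Assumption for this sketch: memory operands look like "[ebp-40]"."""
    return operand.startswith("[")


def spill(code, reg, slot, fresh):
    """Rewrite two-operand instructions so that reg lives in memory at slot.

    fresh() must return a new, unused imaginary register name.
    Returns (new_code, needs_restart); needs_restart is True when a helper
    register was introduced, so allocation has to run again on new_code.
    """
    out, needs_restart = [], False
    for op, dst, src in code:
        if reg not in (dst, src):
            out.append((op, dst, src))      # instruction is unaffected
        elif dst == src:
            continue                        # self-move of reg is a no-op
        elif src == reg and not is_mem(dst):
            out.append((op, dst, slot))     # read directly from the slot
        elif dst == reg and not is_mem(src):
            out.append((op, slot, src))     # write directly to the slot
        else:
            # the other operand is memory too: go through a helper register
            helper = fresh()
            needs_restart = True
            if dst == reg:
                out.append((op, helper, src))
                out.append((op, slot, helper))
            else:
                out.append((op, helper, slot))
                out.append((op, dst, helper))
    return out, needs_restart
```

A driver would call the allocator again whenever needs_restart is True, 
since the helper registers themselves still have to be coloured.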

> My current suspicion is that something is missing regarding handling of 
> running out of VFP registers and it hasn't been noticed before because no one 
> has tried to do what I'm doing (implementing a calling convention using VFP 
> registers and then stress testing it) but I've no idea where to look in the 
> source code to confirm/refute that idea.

It doesn't look like spilling happens in your example: I don't see moves 
from/to the stack frame as a temporary location.

Perhaps the first thing to check is to add a breakpoint in 
Tinterferencebitmap.setbitmap.

The "versions" of s20 use superregisters 50 and 70, so a setbitmap (50,70) 
or (70,50) should be called at some point to tell the register allocator 
both registers are active at the same time and cannot be coalesced.
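
To make clear what that call is supposed to record, here is a small Python 
sketch of an interference bitmap. It is a hypothetical stand-in, not the 
Tinterferencebitmap implementation: the class name InterferenceBits and 
the methods add and interferes are invented for the example, and it 
assumes the edge is stored in both directions.

```python
class InterferenceBits:
    """Symmetric bit matrix: bit v of row u is set when u and v interfere."""

    def __init__(self):
        self.rows = {}  # superregister number -> int used as a bit set

    def add(self, u, v):
        # Record the edge in both directions so lookups are order-independent.
        self.rows[u] = self.rows.get(u, 0) | (1 << v)
        self.rows[v] = self.rows.get(v, 0) | (1 << u)

    def interferes(self, u, v):
        return bool((self.rows.get(u, 0) >> v) & 1)


bitmap = InterferenceBits()
bitmap.add(50, 70)

# Two superregisters may only share a real register (or be coalesced)
# when no interference was recorded between them.
may_share = not bitmap.interferes(70, 50)
```

If no edge between 50 and 70 is ever recorded, nothing stops the allocator 
from giving both the same real register, which would match the symptom in 
the subject line.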

Daniël

