[fpc-devel] System 370: Episode 3. Addressing and its limits Part One!

Mon Feb 6 18:55:37 CET 2012

Episode 3.  Addressing and its limits Part One!

First, let me apologise for this post as it's going to be a large one.  Second,
I don't talk about 64 bit modes here because I have never used them.  But the
basics will not have altered.  IBM really does put a lot of effort into
maintaining backwards compatibility.

Third, I don't actually know anything about the internals of FreePascal, or
any other compiler come to that, so some, or all, of the techniques discussed
here, and in part 2, may be impractical or even impossible to implement.  It
should be noted, however, that this is not an exhaustive list of techniques.

Fourth, it should be noted here that if the intention is to provide support on
Hercules based systems, Hercules allows us to use the newer instructions
introduced by the processor upgrades even though we are using processors that
shouldn't, in theory, support them.  This doesn't apply to providing 31 or 64
bit addressing, however, as considerable operating system support is required
to handle these modes.

Finally, I may include bits of 370 assembler in this post.  I don't see how I
can avoid it.  I will try and keep them as brief and non-technical as possible,
but if you feel your eyes glazing over, ask and I will try and explain another
way.

                  ------------------------------------

So, does 370 architecture have a 4k limit on code and data?  Well, yes... and
no... Sort of... maybe...  It depends...

Prior to the upgrades of the 390 processor there was only 1 addressing mode:
Base / Displacement, or effective, addressing.  The newer processors introduced
Program Counter (PC) relative addressing (on 390 systems the program counter
is held in the PSW), but it only applies to code and, perhaps, constants, and
then only to some instructions.  It doesn't apply to data, so the limits are
still relevant.

A Base / Displacement operand consists of a 16 bit value: the first 4 bits
select a register, and the other 12 bits hold a displacement from 0 to 4095.
The actual, or effective, address for each storage operand is calculated as the
unsigned addition of the value held in the base register to the displacement
from the instruction itself.
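
To make the layout concrete, here is a minimal sketch, in Python purely as a model, of how the 16 bit field splits into its two parts.  The function name and the sample value are hypothetical, chosen only for illustration.

```python
def split_base_displacement(field):
    """Split a 16 bit base/displacement field into (base register, displacement)."""
    if not 0 <= field <= 0xFFFF:
        raise ValueError("field must fit in 16 bits")
    base = field >> 12       # first (high-order) 4 bits: register number 0..15
    disp = field & 0x0FFF    # remaining 12 bits: displacement 0..4095
    return base, disp


# Hypothetical sample: X'C1F4' means base register 12, displacement X'1F4'.
print(split_base_displacement(0xC1F4))
```

Because the displacement has only 12 bits, 4095 is the furthest any one operand can reach from whatever its base register holds.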

The effective address for each storage reference is real or virtual and 24 or
31 bit depending upon the mode the processor is in at the time. In our case it
will, probably, always be a virtual address.

It should be noted that the base register may not be register 0. Register 0 has
an implied value of 0 when used for addressing purposes.
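
The rules above (add base to displacement, treat a base of 0 as the value 0, and keep only 24 or 31 bits of the result) can be sketched as follows.  This is a model in Python, not real machine code; the names `effective_address` and `AMODE_MASK` and the register contents are assumptions made up for the example, and the sketch ignores the index register that some instruction formats add.

```python
# Bits of the sum that survive in each addressing mode.
AMODE_MASK = {24: 0x00FFFFFF, 31: 0x7FFFFFFF}


def effective_address(regs, base, disp, amode=24):
    """Model base/displacement address generation for one storage operand."""
    if not 0 <= disp <= 4095:
        raise ValueError("displacement must be 0..4095")
    # A base field of 0 contributes 0, whatever register 0 actually holds.
    base_value = 0 if base == 0 else regs[base]
    # Unsigned addition; anything above the mode's address width is dropped.
    return (base_value + disp) & AMODE_MASK[amode]


regs = [0] * 16
regs[0] = 0xDEADBEEF      # ignored when used as a base
regs[12] = 0x00020000     # hypothetical load address of a module

print(hex(effective_address(regs, 12, 0x1F4)))   # 0x20000 + 0x1F4
print(hex(effective_address(regs, 0, 16)))       # just the displacement
```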

It is plain that each instruction operand of the Base / Displacement form can
only reference a range of 4k, hence the urban myth that this is a limit on
the size of a module.  This is where USING enters the fray.  USING is an
instruction to the assembler, not to the machine.  It tells the assembler that
a particular register holds the address (24 or 31 bit) of the label mentioned.
It is still up to the programmer to load that address into the register; the
assembler won't (actually can't) do this for us.
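
As a rough model of what the assembler does with that information, the Python sketch below turns a label's offset into a register and displacement using a table built from the USING statements.  The function name `resolve` and the sample offsets are hypothetical.  The tie-break (smallest displacement, then highest register number) is my understanding of the documented assembler rule, so treat it as an assumption.

```python
def resolve(target, using_table):
    """Pick (register, displacement) for an offset, given {register: promised offset}."""
    candidates = []
    for reg, base_addr in using_table.items():
        disp = target - base_addr
        if 0 <= disp <= 4095:                 # within reach of this base
            candidates.append((disp, -reg))   # -reg: prefer the higher register on ties
    if not candidates:
        raise ValueError("addressability error: no base register covers the operand")
    disp, neg_reg = min(candidates)
    return -neg_reg, disp


# One base at the start of the module (think USING PROG,R12 with PROG at offset 0).
print(resolve(0x0F00, {12: 0}))               # reachable through R12

# A second base promised at offset X'F00' brings later storage within reach.
print(resolve(0x1000, {12: 0, 13: 0x0F00}))   # only R13 covers it

try:
    resolve(0x1000, {12: 0})                  # 4096 bytes past the only base
except ValueError as err:
    print(err)
```

Note that the table only records what the assembler has been told.  Nothing in it checks that the register really holds that address at run time.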

Throughout, I am assuming that we will be using what IBM defines as standard
linkage conventions between modules.  Let's start with a basic bit of code that
represents a function:
PROG     CSECT              defines the name of our function
         £START             set up standard linkage  
         LR    R12,R15      R15 has the address of PROG, copy it to R12
         USING PROG,R12     Tell the assembler to use R12 as base
           <code goes here>
         £END               return to caller
SAVEAREA DS    18F          save area for standard linkage         
           <working storage goes here>
LITPOOL  LTORG
           <constants get defined here>
         END
£START and £END are macros to set up the standard linkage stuff, and SAVEAREA
is a required area.  The details don't really matter.  If the total size of the
code, working storage and constants grows beyond 4k, we will get assembly
errors, because everything has to be reached from the single base register R12.

However, we can use the USING instruction to help us out here.  Part of the
standard linkage is that R13 has to point to an area of 18 fullwords (32 bits
each).  We can take advantage of that by adding a USING SAVEAREA,R13 to our
code:
PROG     CSECT              defines the name of our function
         £START             set up standard linkage
         LR    R12,R15      R15 has the address of PROG, copy it to R12
         USING PROG,R12     Tell the assembler to use R12 as base
         USING SAVEAREA,R13 use R13 as base register for working storage
           <code goes here>
         £END               return to caller
SAVEAREA DS    18F          save area for standard linkage
           <working storage goes here>
LITPOOL  LTORG
           <constants get defined here>
         END
Now we have defined 2 base registers.  We are not allocating an extra register,
because we have to use R13 as a save area pointer anyway; we are simply also
using it to address the storage that follows the save area.  Now our code can
be up to 4k, and our save area, working storage and constants can be another
4k: 8k in total, but still limited.

One final example and we'll call it a day for part 1.  This post is long enough
as it is.
PROG     CSECT              defines the name of our function
         £START             set up standard linkage
         LR    R12,R15      R15 has the address of PROG, copy it to R12
         USING PROG,R12,R11 Tell the assembler R12 and R11 are bases
         LA    R11,4095(,R12) we can set up a second base for the code
         LA    R11,1(,R11)  by pointing it 4k past the first one
         L     R10,LITADDR  and address the literal pool separately
         USING LITPOOL,R10
         USING SAVEAREA,R13 use R13 as base register for working storage
           <code goes here>
         £END               return to caller
LITADDR  DC    A(LITPOOL)   literal pool address, kept with the code
SAVEAREA DS    18F          save area for standard linkage
           <working storage goes here>
LITPOOL  LTORG
           <constants get defined here>
         END
Here we have set up a second register, R11, to point 4k past R12 and we have
used this as a base.  The code segment can now be 8k.  We have also added a
separate register, R10, to handle the literals.  We now have 16k we can address.
A further refinement we can pull is to address all the initialisation code with
R12.  When we enter the main code, we reset R12 to the start of the main code,
and similarly for the exit code.  This could give us up to 12k with one base
register.
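
The 16k figure can be checked with a small model.  The Python sketch below assumes a hypothetical layout in which each region is exactly 4k or 8k (code at offset 0, SAVEAREA at 8192, LITPOOL at 12288); a real module will not line up that neatly.  The function names are mine, and the sketch assumes that a USING naming several registers treats each further register as holding 4096 more than the one before it.

```python
def expand_using(base_addr, registers):
    """Model USING label,Ra,Rb,...: each further register is taken as 4096 higher."""
    return {reg: base_addr + 4096 * i for i, reg in enumerate(registers)}


def addressable_bytes(using_table):
    """Count the distinct offsets that at least one base register can reach."""
    covered = set()
    for base_addr in using_table.values():
        covered.update(range(base_addr, base_addr + 4096))
    return len(covered)


# Hypothetical offsets for the three regions of the module.
PROG, SAVEAREA, LITPOOL = 0, 8192, 12288

table = {}
table.update(expand_using(PROG, [12, 11]))    # USING PROG,R12,R11
table.update(expand_using(LITPOOL, [10]))     # USING LITPOOL,R10
table.update(expand_using(SAVEAREA, [13]))    # USING SAVEAREA,R13

print(table)
print(addressable_bytes(table))               # four bases of 4k each
```

If the regions are smaller than this, the ranges overlap and the total reach is less than 16k, which is why the figure is an upper bound rather than a guarantee.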

But there is a limit to the number of registers we can toss around like this,
and anything more complicated than the above would, if we were coding by hand,
probably get split into two or more modules.