[fpc-devel] Fwd: Re: An optimization suggestion for FPC

Jonas Maebe jonas at freepascal.org
Sun Jun 28 13:54:06 CEST 2020


[accidentally only sent to Gareth initially]

On 28/06/2020 12:31, J. Gareth Moreton wrote:
> So someone reached out to me directly again asking for an FPC
> optimisation.  Now I want to see if this is possible to optimise and
> won't break something or be annoyingly specific.

The general optimisation that would handle this is promoting individual
record members into standalone variables when possible. FPC currently
has no support at all for this.

An optimisation that's a bit less general (although orthogonal in some
cases, namely when you don't need to access individual members), is
keeping records as a whole in a register. FPC already has support for
this, see tstoreddef.is_intregable and tabstractvarsym.setregable.

It does not get triggered here on x86-64 because of another involved
method: the {$if} at the end of tabstractvarsym.is_regvar. On all
architectures except PowerPC and PowerPC64, that code prevents records
from being kept in registers if they are written to.

The reason for this is that other supported architectures lack
instructions to efficiently extract and insert bitfields from/into
integer registers (although perhaps some of the newer x86-64 processors
include them as part of an extension; and I think AArch64 and certain MIPS
subarchs could also support it efficiently). This means that to perform
an operation on a field of a record kept in a register, you have to do
the following in the general case:
1) extract the field. On generic x86, that would be a move to a
temporary register, then possibly a shift, and then possibly an "and".
2) perform the operation
3) possibly shift back the value to the correct position, clear it in the
original register (mask its position with 0), and then "or" the result
to insert it again

In this case, just loading a value from memory (probably L1 cache, since
register variables are only used locally within a single routine),
performing the operation, and storing it back, is quite likely to be
faster, and definitely results in much smaller code.
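
To make the three steps above concrete, here is a minimal C sketch of what
they amount to when a record is held in a single 64-bit integer register.
The layout (a 16-bit field at bits 16..31) and the names FIELD_SHIFT,
FIELD_MASK and add_to_field are assumptions made up for illustration; this
is not code from the compiler, nor the exact sequence FPC would generate.

```c
#include <stdint.h>

/* Hypothetical layout: the whole record is packed into one 64-bit value,
   as if it were kept in an integer register. The field we want to modify
   is 16 bits wide and sits at bits 16..31. */
#define FIELD_SHIFT 16u
#define FIELD_MASK  UINT64_C(0xFFFF)

/* Adds delta to the 16-bit field and returns the updated record. */
static uint64_t add_to_field(uint64_t rec, uint16_t delta)
{
    /* 1) extract: copy to a temporary, shift down, mask off neighbours */
    uint64_t field = (rec >> FIELD_SHIFT) & FIELD_MASK;

    /* 2) perform the operation, wrapping to the field width */
    field = (field + delta) & FIELD_MASK;

    /* 3) insert: clear the field's old bits in the record, then shift
          the new value back into position and "or" it in */
    rec &= ~(FIELD_MASK << FIELD_SHIFT);
    rec |= field << FIELD_SHIFT;
    return rec;
}
```

Calling add_to_field(0x1122334400050007, 3) gives 0x1122334400080007: only
the field at bits 16..31 changes (from 5 to 8), at the cost of several
extra instructions around the single addition.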

However, as you've undoubtedly realised, in this case none of that
shifting/masking would come into play, since the record only contains a
single field. So you could definitely add an exception for that case for
all architectures. We even have the perfect helper method for that by
now: tabstractrecordsymtable.has_single_field()
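
As a rough illustration of why the single-field case is harmless, here is
the same idea in C. The type wrapped_int and the function wrapped_add are
hypothetical names chosen for this sketch; the point is only that the
field occupies the whole record, so operating on it needs no shift or mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record with exactly one field. */
typedef struct {
    int64_t value;
} wrapped_int;

/* The record is exactly as large as its only field, so a register
   holding the record holds the field directly. */
static_assert(sizeof(wrapped_int) == sizeof(int64_t),
              "single-field record has the size of its field");

/* Operating on the field is a plain integer operation on the whole
   record value: no extraction or re-insertion is involved. */
static wrapped_int wrapped_add(wrapped_int a, wrapped_int b)
{
    wrapped_int r;
    r.value = a.value + b.value;
    return r;
}
```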


Jonas

PS: that person also asked the same question on the forum
(https://forum.lazarus.freepascal.org/index.php?topic=50364)

PS2: the case Benito mentions is a different thing again. Managed
records can never be kept *only* in a register, because they need
initialisation and finalisation, which requires them to be in memory.
Caching individual fields of those locally in a register (while the
record itself remains in memory) would definitely require the general
optimisation I mentioned in the first paragraph.

