[fpc-devel] CMem allocator memory alignment experiment

Karoly Balogh (Charlie/SGR) charlie at scenergy.dfmk.hu
Wed Nov 19 11:49:39 CET 2014


Hi,

On Wed, 19 Nov 2014, Mark Morgan Lloyd wrote:

> > At least on Linux, malloc() is documented to align to 64 bit on 32 bit and
> > 128 bit on 64 bit platforms, while this way cmem's GetMem() reduces that
> > to 4 bytes and 8 bytes, respectively.
>
> Since cmem is intended for use by FPC, I don't see this as a serious issue
> unless somebody is exchanging malloc()ed blocks between Pascal and C code.

Since the RTL's allocator is documented to align to 16 bytes, it's
definitely an issue with Pascal code as well. We do have code where the
Pascal side also triggers the alignment issue, but indeed, the main issue
is with linked C libs which depend on pointers from the Pascal side.
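
To illustrate how an allocator wrapper can lose malloc()'s alignment, here
is a minimal C sketch. It assumes the wrapper keeps the block size in a
header in front of the pointer it returns, which would explain the drop to
4 and 8 bytes mentioned above. All function names and the 16-byte header
size are hypothetical; this is not the actual cmem code, only a sketch of
the general technique.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: a 16-byte header keeps whatever alignment malloc() provides
 * (up to 16 bytes). The value is illustrative, not taken from cmem. */
#define PADDED_HEADER 16

static_assert(PADDED_HEADER >= sizeof(size_t), "header must hold the size");

/* Naive scheme: one size_t header, so the returned pointer sits
 * sizeof(size_t) bytes (4 or 8) past malloc()'s aligned address. */
void *naive_getmem(size_t size)
{
    size_t *block = malloc(sizeof(size_t) + size);
    if (block == NULL)
        return NULL;
    *block = size;
    return block + 1;
}

void naive_freemem(void *p)
{
    if (p != NULL)
        free((size_t *)p - 1);
}

/* Padded scheme: the header is rounded up to 16 bytes, so the returned
 * pointer keeps the alignment of the underlying malloc() block. */
void *padded_getmem(size_t size)
{
    unsigned char *block = malloc(PADDED_HEADER + size);
    if (block == NULL)
        return NULL;
    *(size_t *)block = size;
    return block + PADDED_HEADER;
}

/* Read back the size stored at the start of the padded header. */
size_t padded_memsize(const void *p)
{
    return *(const size_t *)((const unsigned char *)p - PADDED_HEADER);
}

void padded_freemem(void *p)
{
    if (p != NULL)
        free((unsigned char *)p - PADDED_HEADER);
}

/* Offset of p from the previous 16-byte boundary (0 means aligned). */
unsigned offset16(const void *p)
{
    return (unsigned)((uintptr_t)p % 16);
}
```

On a 64-bit system where malloc() returns 16-byte-aligned blocks,
offset16() would be expected to report 8 for a pointer from naive_getmem()
and 0 for one from padded_getmem(). The cost of the padded scheme is a few
extra bytes per allocation.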

> > This causes multiple performance and other issues, especially on
> > processors which require stricter alignment (most ARM CPUs, but also x86
> > with SSE, etc).
>
> I'm not sure to what extent this remains an issue with current ARM chips. I've
> got limited ARM hardware, but some tests that I did with somebody else a few
> months ago didn't show up any issues.

We also have limited ARM hardware, based on older ARM cores where this is
an issue. We use FPC in production on these chips, so it's an issue for us.
And since these cores will remain in production for the coming years (not
just for us, but in general), the compiler and libs have to deal with it.

> Perhaps the most serious scenario is where an architecture or particular
> implementation requires alignment, but the kernel traps alignment errors and
> fixes them silently. SPARC Solaris does this and my understanding is that it
> introduces a very significant performance overhead;

Linux also does this. Actually there's plenty of hardware where this is
an issue. Almost all "RISC" chips, especially embedded ones, have
alignment restrictions to some degree. I know older PPC and recent Power
chips have them as well. And even the fastest CPUs pay some performance
penalty for unaligned accesses, even when the hardware handles them
itself without involving the software side.

> ARM Linux may also do it (where demanded by the hardware) but my
> understanding is that notifications can be enabled.

Yes, we have these notifications enabled, and we're flooded with them
when using the cmem allocator. This is why I started working on this.

Charlie


