[fpc-devel] Patch, font rendering on Arm-Linux devices.
daniel.mantione at freepascal.org
Fri Feb 29 15:43:58 CET 2008
On Fri, 29 Feb 2008, Christian Iversen wrote:
>> Memory access. What happens is that the non-packed version causes more
>> cache misses. A cache miss costs many cycles on a modern cpu, a misaligned
>> read just costs an extra memory access (which is fast if cached) on x86,
>> and an extra load instruction on ARM. This is much cheaper than a cache miss.
> It's much worse than that. Some architectures simply _can't_ do unaligned
> access, and they will trigger an exception.
> This exception will in many configurations be caught by the OS, which then
> might simulate the read by doing 2 reads, putting the result together,
> writing into the application memory, and doing a task switch.
> This, in total, is several _orders of magnitude_ worse than unaligned access
> on a supported platform.
> Of course, unaligned access in itself is pretty bad.
True, but irrelevant, because the discussion was under the assumption that
an unaligned read is done using the "unaligned" pseudo function. Unless
there is a bug in the compiler, the use of "unaligned" will never cause an
exception.
Instead "unaligned" will simulate an unaligned load with two loads and
some rotation etc. On the ARM, where every mnemonic can rotate operands,
this isn't that bad of a penalty.
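
To make the "two loads and some rotation" idea concrete, here is a minimal
sketch in C. It is not the code FPC emits for "unaligned"; the helper name
load_u32_unaligned and its parameters are made up for illustration, and bytes
are assumed to be numbered little-endian within each 32-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit value that starts at byte offset
   byte_off inside a buffer of aligned 32-bit words, using only aligned
   word loads. Byte 0 of a word is assumed to be its lowest 8 bits. */
static uint32_t load_u32_unaligned(const uint32_t *words, size_t nwords,
                                   size_t byte_off)
{
    size_t idx = byte_off / 4;                      /* word holding the first byte */
    unsigned shift = (unsigned)(byte_off % 4) * 8;  /* bit offset inside that word */

    assert(idx < nwords);
    if (shift == 0)
        return words[idx];                          /* aligned case: a single load */

    assert(idx + 1 < nwords);
    uint32_t lo = words[idx];                       /* first aligned load */
    uint32_t hi = words[idx + 1];                   /* second aligned load */
    return (lo >> shift) | (hi << (32 - shift));    /* shift both parts and merge */
}
```

With the words 0x44332211 and 0x88776655 in the buffer, reading at byte
offset 1 combines the upper three bytes of the first word with the lowest
byte of the second. No load ever touches a misaligned address, so no
alignment exception can be raised.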
Therefore, I wouldn't be surprised if, even on ARM, arrays of packed
structures are faster than arrays of unpacked structures.
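
The cache argument comes down to how many bytes an array occupies. A rough
sketch in C follows, using the GCC/Clang "packed" attribute as a stand-in for
a Pascal packed record. The record layout and all names are invented for
illustration, and the unpacked size depends on the ABI (8 bytes is typical,
not guaranteed).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: a 1-byte tag followed by a 4-byte value. */
struct UnpackedRec {
    uint8_t  tag;
    uint32_t value;   /* the compiler usually pads before this field */
};

struct __attribute__((packed)) PackedRec {
    uint8_t  tag;
    uint32_t value;   /* no padding, so this field may be misaligned */
};

/* With the packed attribute the record is exactly 1 + 4 bytes. */
static_assert(sizeof(struct PackedRec) == 5, "packed layout has no padding");

/* Memory footprint of an array of n records of each kind. */
static size_t unpacked_array_bytes(size_t n)
{
    return n * sizeof(struct UnpackedRec);
}

static size_t packed_array_bytes(size_t n)
{
    return n * sizeof(struct PackedRec);
}
```

On an ABI where the unpacked record takes 8 bytes, 1000 records occupy 8000
bytes unpacked against 5000 bytes packed, so a scan over the packed array
touches fewer cache lines. Whether that outweighs the extra load and shift
per field access would have to be measured.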