[fpc-devel] Patch, font rendering on Arm-Linux devices.
Christian Iversen
chrivers at iversen-net.dk
Fri Feb 29 15:35:38 CET 2008
Daniël Mantione wrote:
>
>
> On Tue, 26 Feb 2008, Luiz Americo Pereira Camara wrote:
>
>> Yury Sidorov wrote:
>>> The patch removes packed record for some platforms.
>>> IMO packed can be removed for all platforms. It will gain some speed.
>>
>> I'd like to understand this issue better.
>> Why are non-packed records faster?
>
> Cache thrashing. One of the most underestimated performance killers in
> modern software.
>
>> Does the difference occur at memory allocation or at memory access?
>
> Memory access. What happens is that the non-packed version causes more
> cache misses. A cache miss costs many cycles on a modern CPU; a
> misaligned read just costs an extra memory access (which is fast if
> cached) on x86, and an extra load instruction on ARM. This is much
> cheaper than a cache miss.
It's much worse than that. Some architectures simply _can't_ do
unaligned access, and they will trigger an exception.
In many configurations this exception is caught by the OS, which
might then emulate the read by doing two aligned reads, combining the
results, writing the value into the application's memory, and doing a
task switch.
In total, this is several _orders of magnitude_ worse than unaligned
access on a platform that supports it.
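To make the emulation step concrete, here is a minimal C sketch of how one unaligned 32-bit read can be pieced together from two aligned reads. The helper name `emulated_unaligned_load` and the word-array memory model (byte 0 in the low bits of each word, i.e. a little-endian view) are assumptions made for illustration only. A real OS fixup handler also has to take the trap, decode the faulting instruction and write the result back, all of which comes on top of the work shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read 32 bits starting at an arbitrary byte offset
 * using only aligned 32-bit loads. Memory is modelled as an array of
 * words with byte 0 in the low bits of each word (little-endian view). */
static uint32_t emulated_unaligned_load(const uint32_t *words, size_t nwords,
                                        size_t byte_offset)
{
    size_t index = byte_offset / 4;                   /* word containing the first byte */
    unsigned shift = (unsigned)(byte_offset % 4) * 8; /* misalignment in bits */

    assert(index < nwords);
    if (shift == 0)
        return words[index];        /* aligned case: one load is enough */

    assert(index + 1 < nwords);
    uint32_t lo = words[index];     /* first aligned load */
    uint32_t hi = words[index + 1]; /* second aligned load */

    /* Take the upper bytes of the first word and the lower bytes of the
     * second word, then combine them into one value. */
    return (lo >> shift) | (hi << (32 - shift));
}
```

With the two words 0x44332211 and 0x88776655, a read at byte offset 1 straddles both words and has to be assembled from two loads plus shifts, while a read at offset 0 or 4 is a single load.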
Of course, unaligned access in itself is pretty bad.
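For readers who want to see where the misaligned field comes from, the following is a rough C analogue of a packed versus non-packed record. It is a sketch under stated assumptions: it relies on the GCC/Clang `__attribute__((packed))` extension rather than Free Pascal's `packed record`, and the type and function names (`natural_rec`, `packed_rec`, `packed_value`) are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Natural layout: the compiler inserts padding after 'tag' so that
 * 'value' starts at an address suitable for a 32-bit load. */
struct natural_rec {
    uint8_t  tag;
    uint32_t value;
};

/* Packed layout (GCC/Clang extension, assumed here): no padding,
 * so 'value' starts at byte offset 1 and is misaligned. */
struct __attribute__((packed)) packed_rec {
    uint8_t  tag;
    uint32_t value;
};

/* The packed record saves space: 1 + 4 bytes and nothing else. */
static_assert(sizeof(struct packed_rec) == 5, "packed record has no padding");

/* Read the misaligned field through memcpy, so the compiler is free to
 * pick whatever load sequence is safe on the target instead of issuing
 * a plain 32-bit load at a misaligned address. */
static uint32_t packed_value(const struct packed_rec *r)
{
    uint32_t v;
    memcpy(&v, (const unsigned char *)r + offsetof(struct packed_rec, value),
           sizeof v);
    return v;
}
```

The packed variant is smaller, which helps the cache, but every access to `value` is misaligned; the natural variant spends padding bytes to keep the field aligned.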
--
Kind regards
Christian Iversen