[fpc-devel] Patch, font rendering on Arm-Linux devices.
jura at cp-lab.com
Fri Feb 29 17:05:39 CET 2008
From: "Daniël Mantione" <daniel.mantione at freepascal.org>
>>> Instead, "unaligned" will simulate an unaligned load with two loads
>>> and some rotation etc. On the ARM, where every mnemonic can rotate
>>> operands, this isn't that bad of a penalty.
>>> Therefore, I wouldn't be surprised if, even on ARM, arrays with packed
>>> structures are faster than arrays with unpacked structures.
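For readers unfamiliar with the trick described above, here is a rough sketch in C of how an unaligned 32-bit load can be emulated with two aligned loads plus shifts. It is only an illustration, not the code FPC actually emits: the function name load_unaligned_u32 is made up, and a little-endian target is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulate an unaligned 32-bit load using only aligned word loads.
 * Assumption: little-endian target. `words` is an aligned buffer and
 * `byte_offset` is the byte position of the wanted value inside it.
 * The caller must make sure the following word is readable whenever
 * `byte_offset` is not a multiple of 4. */
static uint32_t load_unaligned_u32(const uint32_t *words, size_t byte_offset)
{
    assert(words != NULL);

    size_t index = byte_offset / 4;                    /* aligned word holding the first byte */
    unsigned shift = (unsigned)(byte_offset % 4) * 8;  /* misalignment in bits */

    uint32_t low = words[index];                       /* first aligned load */
    if (shift == 0)
        return low;                                    /* already aligned: one load suffices */

    uint32_t high = words[index + 1];                  /* second aligned load */
    return (low >> shift) | (high << (32 - shift));    /* splice the two parts together */
}
```

On ARM the two shifts can be folded into the combining instruction's shifted operand, which is presumably why the penalty is described as small.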
>> That's possible. Why would it be faster, btw? Better cache
>Like I mentioned, unlike modern x86 processors, ARM processors cannot
>detect an array traversal and preload the array into the cache. If the
>array is not in cache, you get cache miss after cache miss.
>A cache miss is very expensive given the latencies of modern memory. A
>packed array results in fewer cache misses.
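To put a number on the cache argument above, here is a small C sketch comparing the footprint of a packed and an unpacked record. The record layout and the helper cache_lines_touched are made up for illustration, the packed attribute is the GCC/Clang spelling, and the 32-byte cache line used in the note below is an assumption, not a figure from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record laid out with natural alignment. */
struct unpacked_record {
    uint8_t  tag;
    uint32_t value;   /* the compiler inserts 3 padding bytes before this field */
};

/* Same fields without padding; `value` now sits at offset 1 (unaligned). */
struct __attribute__((packed)) packed_record {
    uint8_t  tag;
    uint32_t value;
};

/* Number of cache lines touched when traversing `count` contiguous records
 * of `record_size` bytes, assuming the array starts on a line boundary. */
static size_t cache_lines_touched(size_t count, size_t record_size, size_t line_size)
{
    assert(line_size > 0);
    size_t total_bytes = count * record_size;
    return (total_bytes + line_size - 1) / line_size;  /* round up */
}
```

With these layouts the unpacked record takes 8 bytes and the packed one 5, so traversing 1000 records with 32-byte lines touches 250 lines unpacked versus 157 packed. Whether that saving outweighs the cost of the unaligned accesses is exactly what has to be measured.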
I ran my benchmark on an ARM mobile device and got the following results:
2080 ms - non-packed
4450 ms - packed
The packed version is more than twice as slow, which clearly shows that
unaligned access kills performance on ARM...
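For anyone who wants to repeat this kind of measurement, a minimal sketch of such a comparison in C could look like the following. This is not the benchmark that produced the numbers above; the record layout, RECORD_COUNT, and all function names are assumptions chosen for illustration, and the packed attribute is the GCC/Clang spelling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define RECORD_COUNT 4096   /* hypothetical array length */

struct plain_item { uint8_t tag; uint32_t value; };                         /* padded, aligned */
struct __attribute__((packed)) packed_item { uint8_t tag; uint32_t value; }; /* value unaligned */

static struct plain_item  plain_items[RECORD_COUNT];
static struct packed_item packed_items[RECORD_COUNT];

/* Give both arrays identical contents so the two sums must agree. */
static void fill_items(void)
{
    for (size_t i = 0; i < RECORD_COUNT; i++) {
        plain_items[i].tag = packed_items[i].tag = (uint8_t)i;
        plain_items[i].value = packed_items[i].value = (uint32_t)i;
    }
}

/* Traverse the aligned array `passes` times and sum the value fields. */
static uint32_t sum_plain(int passes)
{
    assert(passes > 0);
    uint32_t sum = 0;
    for (int p = 0; p < passes; p++)
        for (size_t i = 0; i < RECORD_COUNT; i++)
            sum += plain_items[i].value;
    return sum;
}

/* Same traversal over the packed array; every value load is unaligned. */
static uint32_t sum_packed(int passes)
{
    assert(passes > 0);
    uint32_t sum = 0;
    for (int p = 0; p < passes; p++)
        for (size_t i = 0; i < RECORD_COUNT; i++)
            sum += packed_items[i].value;
    return sum;
}

/* Convert two clock() readings into milliseconds of CPU time. */
static double elapsed_ms(clock_t start, clock_t end)
{
    return 1000.0 * (double)(end - start) / (double)CLOCKS_PER_SEC;
}
```

Call fill_items() once, then take a clock() reading before and after each sum function and pass the pair to elapsed_ms. Use the returned sums (for example, print them) so the compiler cannot drop the loops, and pick a pass count large enough that each run takes well over a second.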