[fpc-pascal] for loops performance problems?
Karoly Balogh (Charlie/SGR)
charlie at scenergy.dfmk.hu
Wed Jul 5 23:02:23 CEST 2017
On Wed, 5 Jul 2017, Anthony Walter wrote:
> I replaced the calls to World.Vertex/.TexCoord/.Color with a local
> vertex buffer (an array of TColorTexVertex) eliminating the function
> calls you mentioned. The frames per second with vsync off is identical,
> so I'm pretty sure that's not causing the slowdown. It's either that
> the addition/multiplication of floats given the font map (heights/widths
> stored in an array) is inefficient or that there is something about the
> nature of a for..loop that is causing it to be slow.
If you still think that loop causes the slowdown, can you post the
generated assembly of it with -al? Otherwise it's really just guesswork.
Also, since you're compiling for ARM, if I'm correct, make sure that:
A., you're using the hardfloat target, and not actually using the softfpu...
B., your data structures are properly aligned, and any underlying records
are *NOT* declared as packed.
C., you're actually doing aligned accesses, so there are no hidden
exceptions involved from the kernel side, handling the load/store of your
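
To illustrate point B, here is a minimal sketch of how "packed" changes a
record's layout. The type names (TPackedVertex, TAlignedVertex) and their
fields are made up for this example and are not taken from the code being
discussed. The exact size and offset of the non-packed record depend on the
target and on the {$PACKRECORDS} setting, so the program prints them rather
than assuming them.

```pascal
program AlignDemo;
{$mode objfpc}

type
  // Hypothetical layout: "packed" removes all padding, so the Single
  // fields start at offset 1 and are not naturally aligned.
  TPackedVertex = packed record
    Flag: Byte;
    X, Y: Single;
  end;

  // Same fields without "packed": the compiler is free to insert padding
  // after Flag so that X and Y land on aligned addresses.
  TAlignedVertex = record
    Flag: Byte;
    X, Y: Single;
  end;

var
  P: TPackedVertex;
  A: TAlignedVertex;
begin
  WriteLn('packed:     size=', SizeOf(P),
          ' offset of X=', PtrUInt(@P.X) - PtrUInt(@P));
  WriteLn('non-packed: size=', SizeOf(A),
          ' offset of X=', PtrUInt(@A.X) - PtrUInt(@A));
end.
```

If the offset of X in the record you actually use is not a multiple of 4,
every access to it is an unaligned float load/store.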
The other example which bubbles up again and again comes down to the fact
that FPC doesn't autovectorize that example, while other compilers, mainly
LLVM, do. With scalar code, FPC is not that far behind, if at all.
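
For reference, this is the general shape of loop where autovectorization
matters. It is only a sketch: the array names and the size N are
hypothetical, and it is not the example referred to above.

```pascal
program ScalarLoop;
{$mode objfpc}

const
  N = 1024;  // hypothetical buffer size

var
  A, B, C: array[0..N - 1] of Single;
  I: Integer;
begin
  // Fill the inputs with simple, exactly representable values.
  for I := 0 to N - 1 do
  begin
    A[I] := I;
    B[I] := 2 * I;
  end;

  // Every iteration is independent of the others, so a vectorizing
  // compiler can process several elements per instruction. Without
  // autovectorization this runs as one scalar multiply and add per element.
  for I := 0 to N - 1 do
    C[I] := A[I] + B[I] * 0.5;

  WriteLn(C[N - 1]:0:1);
end.
```

Compiling this with -al and reading the assembly generated for the second
loop shows whether scalar or vector instructions were emitted on your
target.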