[fpc-devel] BlackFin
Michael Schnell
mschnell at lumino.de
Mon Apr 16 11:57:07 CEST 2007
> r2 = r1 + r3, r4 = dm(i0,m1); /* addition and memory access */
>
Yep. In my answer to Florian I forgot that (unlike the ARM) the Blackfin
can do a calculation and a memory access in a single instruction cycle.
That explains the much better performance even with standard
(non-DSP-like) tasks.
> r3 = r2 * r4, r1 = r2 + r4; /* multiplication and addition */
>
I did not know yet that it can do two independent 32-bit calculations
and that it can do 32-bit multiplications. Anyway, even if only two
32-bit additions can be done in one instruction cycle, this is a big
opportunity for optimization.
> A totally different topic is the inherent parallel processing of a DSP.
> Usually they can utilize several processing units (+, *) and memories
> within a single cycle (e.g. see above). Instruction ordering and
> interleaving to utilize parallelism is tedious to do by hand and I think
> also challenging for a compiler.
>
Maybe a first version could skip these optimization opportunities and
just issue a single operation per instruction cycle.
That way it should be possible to get a working compiler.
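To make the difference concrete, here is a rough sketch in Python (not FPC code, and not real Blackfin syntax). It compares the "one operation per cycle" approach with a very simple pairing pass that puts two neighbouring operations into one cycle when they do not depend on each other. The names Op, independent, schedule_single and schedule_paired are made up for this illustration, and the model assumes that any two independent operations may share a cycle, which ignores the real restrictions on which units can be combined.

```python
from typing import NamedTuple


class Op(NamedTuple):
    dest: str      # register written by the operation
    srcs: tuple    # registers read by the operation
    text: str      # printable form, for display only


def independent(a: Op, b: Op) -> bool:
    """True if b may share a cycle with the earlier op a.

    Deliberately conservative: any overlap between the registers
    written by one op and the registers used by the other blocks pairing.
    """
    return (a.dest not in b.srcs        # b does not read a's result
            and a.dest != b.dest        # they do not write the same register
            and b.dest not in a.srcs)   # b does not overwrite an input of a


def schedule_single(ops):
    """First-version strategy: exactly one operation per cycle."""
    return [[op] for op in ops]


def schedule_paired(ops):
    """Greedy pass: pair each op with its successor when independent."""
    cycles = []
    i = 0
    while i < len(ops):
        if i + 1 < len(ops) and independent(ops[i], ops[i + 1]):
            cycles.append([ops[i], ops[i + 1]])
            i += 2
        else:
            cycles.append([ops[i]])
            i += 1
    return cycles


# The four operations from the quoted examples, in program order.
# "mem(i0, m1)" stands in for the memory access; i0 and m1 are treated
# as plain source registers in this model.
ops = [
    Op("r2", ("r1", "r3"), "r2 = r1 + r3"),
    Op("r4", ("i0", "m1"), "r4 = mem(i0, m1)"),
    Op("r3", ("r2", "r4"), "r3 = r2 * r4"),
    Op("r1", ("r2", "r4"), "r1 = r2 + r4"),
]

for name, sched in (("single", schedule_single(ops)),
                    ("paired", schedule_paired(ops))):
    print(name, len(sched), "cycles")
    for cycle in sched:
        print("   ", ", ".join(op.text for op in cycle))
```

With these four operations the single-issue schedule needs four cycles and the paired one needs two, which matches the two quoted instruction lines. Both schedules keep the program order, so a compiler could start with the first and add something like the second later without touching the rest of the code generator.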
-Michael