[fpc-devel] Successful implementation of inline support for pure assembler routines on x86
J. Gareth Moreton
gareth at moreton-family.com
Mon Mar 18 00:28:14 CET 2019
Speaking of POPCNT, there's an interesting webpage here regarding assembly
v. intrinsics in C++: https://danluu.com/assembly-intrinsics/
One point that's been raised is that on older Intel processors, POPCNT has
a bug in that it has a false dependency on the destination register, so it
will cause a pipeline stall. Though it's a bit of an exceptional case, it
shows where inline assembly has an advantage over intrinsics because you
have finer control over how temporary values are stored and also the order
in which instructions appear in order to handle issues like that with
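As a rough illustration of the kind of control meant here, the sketch below zeroes the destination register immediately before POPCNT, which is the commonly cited way of breaking the false dependency on affected Intel parts. It is written in C with GCC/Clang extended inline assembly rather than FPC syntax, and the names popcount64, popcount64_asm and popcount64_portable are made up for this example. It assumes an x86-64 target and a GCC-compatible compiler; on anything else it falls back to a plain loop.

```c
#include <stdint.h>

/* Portable bit count, used when the POPCNT path is not available. */
static uint64_t popcount64_portable(uint64_t x) {
    uint64_t n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Hypothetical helper: POPCNT preceded by a zeroing XOR of the
 * destination. The XOR is a dependency-breaking idiom, so POPCNT no
 * longer waits on whatever last wrote that register. "=&r" marks the
 * output as early-clobber because it is written before the input is
 * read, which keeps the compiler from placing both in one register. */
static uint64_t popcount64_asm(uint64_t x) {
    uint64_t result;
    __asm__("xorl %k0, %k0\n\t"
            "popcntq %1, %0"
            : "=&r"(result)
            : "r"(x)
            : "cc");
    return result;
}
#endif

/* Use the instruction when the CPU reports it, else the portable loop. */
uint64_t popcount64(uint64_t x) {
#if defined(__x86_64__) && defined(__GNUC__)
    if (__builtin_cpu_supports("popcnt"))
        return popcount64_asm(x);
#endif
    return popcount64_portable(x);
}
```

With an intrinsic, the compiler decides which register receives the result and whether a zeroing instruction is emitted first; here both choices are explicit.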
I do agree that you should program in Pascal whenever you can for
readability and portability, and I will certainly continue to improve the
x86 peephole optimizer in that regard, but there will always be cases where
assembly language and even inlined assembly language will have a place. I
say give people the choice.
True, implementing it on other platforms will take time and will need a
programmer with intricate knowledge of that platform's instruction set (I
only have access to x86 machines currently), but it's not a super-critical
task and doesn't actually break the compiler by not being present. Still,
when the day comes that I get an ARM-powered Raspberry Pi, I'll happily
start researching its assembly language to improve FPC on there as well,
especially if people start demanding it. I'm under the impression that,
generally, i386 and x86_64 are two of the most popular platforms when it
comes to development of FPC projects, so it makes sense to target those
Gareth aka. Kit
On Sun 17/03/19 20:58 , Florian Klämpfl florian at freepascal.org sent:
Am 17.03.19 um 21:47 schrieb Martok:
> Am 17.03.2019 um 18:57 schrieb Florian Klämpfl:
>> How is it better than intrinsics support (similar to gcc/icc etc.)?
> It *exists*?
> Remember how long it took to get PopCnt support?
PopCnt is not really an intrinsic as it has a fallback counterpart and
works on all platforms. Intrinsic means that it is really mapped
directly to the CPU instruction without any fallbacks.
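To make the distinction concrete, here is a minimal sketch of what a "fallback counterpart" can look like: a pure software bit count that needs no POPCNT instruction and so works on any platform. It is written in C for illustration only; popcount32_fallback is a made-up name, and this is not claimed to be the routine FPC actually uses.

```c
#include <stdint.h>

/* Software population count using the usual parallel-sum technique.
 * Each step adds neighbouring bit groups in place: pairs, then
 * nibbles, then bytes. */
uint32_t popcount32_fallback(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                 /* 8-bit sums */
    return (x * 0x01010101u) >> 24;  /* add the four bytes together */
}
```

A routine that can drop back to code like this is a portable helper; an intrinsic in the strict sense has no such path and simply fails to compile or run where the instruction is missing.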
As the branch of Jeppe shows, it is pretty easy, just requires some
> How about the rest of the BMI? TBM? AES-NI? Newer AVX?