[fpc-devel] LLVM Backend?

Tue Nov 10 21:35:30 CET 2009

Hello Jonas,

(Replies inline.)

----- Original Message ----
> From: Jonas Maebe <jonas.maebe at elis.ugent.be>
> To: FPC developers' list <fpc-devel at lists.freepascal.org>
> Sent: Tue, November 10, 2009 1:57:03 PM
> Subject: Re: [fpc-devel] LLVM Backend?
>

-snip-

> The main things that basically halted my work on the LLVM backend are:
> a) FPC does not propagate type info down into the register level. There is no 
> notion even of pointers, everything is just an integer of a particular size once 
> it's inside a virtual register. The low level code generator routines are 
> completely oblivious to the symbol table and high level type system (which in 
> general is nice and helps modular design, but it's not nice in this case). It 
> would be possible to simply typecast these integers to i8*/i16*/i32*/i64*/... 
> when loading/storing a value and forgetting about it for the rest, but I'm not 
> sure to what extent that affects LLVM's ability to generate good code (and since 
> getting better code is the whole point...). It would also be possible to 
> implement the LLVM code generation at a higher level (bypassing the low level 
> code generator, and directly translating the parse tree nodes into LLVM byte 
> code), but that would both be a lot of work, and mean that any code generation 
> fix or addition would need to be implemented twice.
> b) the code generator currently does not support CPUs where the size of the 
> integer registers is > sizeof(pointer). I've worked a bit on trying to separate 
> the sizeof(pointer) from sizeof(integer registers), but it's not complete (the 
> {$ifdef cpu64bitaddr} vs {$ifdef cpu64bitalu} defines in the compiler).
> 

The current way to implement pointer arithmetic in LLVM is to use an i64 for every pointer value, convert it to and from the correct pointer type (inttoptr/ptrtoint rather than a plain bitcast) wherever it is needed as an actual pointer, and then use the getelementptr instruction for the pointer-specific addressing modes.  If the 64-bit integer is wider than a pointer on the target and all of its uses are conversions to pointers, the LLVM optimizer figures that out and narrows it to 32 bits.  It's messy but it works.  A proposal for a new primitive type representing a pointer-sized integer is circulating on the LLVM developers' mailing list right now, so we'll see if anything comes of it.
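To make the pattern concrete, here is a rough C analogue of it (not LLVM IR, and not anything taken from FPC): the address travels as a plain integer, the way an untyped virtual register would hold it, and only becomes a typed pointer at the point of the load.  The names load_element, advance and check_roundtrip are made up for this sketch, and it assumes a platform where round-tripping an address through uintptr_t behaves as expected, which is the case on mainstream targets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the address arrives as an integer and is converted
   to a typed pointer only for the load (the inttoptr step).  The indexing
   is then scaled by the element size, as getelementptr would do. */
int32_t load_element(uintptr_t base_as_int, uint32_t index)
{
    const int32_t *base = (const int32_t *)base_as_int;
    return base[index];
}

/* Hypothetical helper: arithmetic done purely on the integer, so the
   scaling by sizeof(int32_t) has to be written out by hand. */
uintptr_t advance(uintptr_t base_as_int, uint32_t count)
{
    return base_as_int + (uintptr_t)count * sizeof(int32_t);
}

/* Small self-check of both helpers. */
void check_roundtrip(void)
{
    int32_t data[4] = { 10, 20, 30, 40 };
    uintptr_t p = (uintptr_t)data;           /* the ptrtoint step */

    assert(load_element(p, 2) == 30);
    assert(load_element(advance(p, 1), 2) == 40);
}
```

The point of the sketch is only that the type information reappears at each load or store; whether the optimizer recovers everything it would have known from a properly typed pointer is exactly the open question Jonas raises.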

Considering that LLVM generates code according to the target data layout (the size and alignment characteristics given near the top of the LLVM assembly source file), it might be worth the great pains it takes to propagate types down the pipe a bit.  It may soon be possible to write a single generic bitcode file that compiles down to each of the platform-specific formats, the way Java or .NET bytecode does (except faster).

> Finally, there are also the i386 problems
> - for Delphi compatibility, FPC's default calling convention on all i386 targets 
> is "Borland fastcall", which is different from all other i386 calling conventions 
> (see http://en.wikipedia.org/wiki/X86_calling_conventions#Borland_fastcall ). 
> And you'd be surprised how much code is out there that depends on this (even our 
> RTL won't work without this, a.o. the signal handling depends on this). So at 
> least for i386, LLVM would first need support for this calling convention.
> - I'm also not sure whether or not LLVM has support for the x87 extended 
> floating type in the mean time (80 bits floating point), another requirement for 
> our i386 port. Other cpu targets are more standard though, and should not pose 
> such problems.

LLVM supports custom calling conventions in the backend, so I'd have to write a patch to the x86 backend to make it accept the Borland fastcall convention.  According to http://llvm.org/docs/LangRef.html#t_floating the 80-bit x87 floating point type (x86_fp80) is supported, but only on the respective platforms.  Normally, I think floating-point code is generated in SSE format using vector intrinsics.
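For reference, this is roughly what such a patch would have to encode, going by the Wikipedia page linked above: arguments are evaluated left to right, the first three go in EAX, EDX and ECX, and the rest are pushed on the stack.  The sketch below is plain C, not LLVM backend code, and it assumes every argument is a 32-bit integer or pointer; the names arg_location, borland_fastcall_location and check_locations are invented for illustration.

```c
#include <assert.h>

/* Hypothetical description of where one argument ends up. */
typedef enum { LOC_EAX, LOC_EDX, LOC_ECX, LOC_STACK } arg_location;

/* Location of argument number `index` (0-based, counted left to right),
   assuming all arguments are 32-bit integers or pointers. */
arg_location borland_fastcall_location(unsigned index)
{
    static const arg_location regs[3] = { LOC_EAX, LOC_EDX, LOC_ECX };
    return index < 3 ? regs[index] : LOC_STACK;
}

/* Small self-check: three register arguments, then the stack. */
void check_locations(void)
{
    assert(borland_fastcall_location(0) == LOC_EAX);
    assert(borland_fastcall_location(1) == LOC_EDX);
    assert(borland_fastcall_location(2) == LOC_ECX);
    assert(borland_fastcall_location(3) == LOC_STACK);
}
```

Floating point arguments, records and the like are deliberately left out here; a real patch would need to handle them too.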

Now I've got 2 questions for you:

Do the object-oriented features of FPC require unmangled C bindings, or can FPC link directly against C++ mangled names?  Reason:  If we can call C++ code directly, we can invoke the LLVM code generation classes to go straight to bitcode rather than tinkering with LLVM's assembly parser.

Would it be too soon to announce that FPC will use LLVM in the future?  Reason:  If I can submit a patch directly to the LLVM repository for the Borland fastcall calling convention on x86, I may be able to ask for help from the existing LLVM developers on their mailing list.

Thanks for your time and consideration,

--Sam Crow