[fpc-pascal] FPC Graphics options?

Fri May 19 23:07:45 CEST 2017


On 05/19/2017 11:24 PM, Sven Barth via fpc-pascal wrote:
> On 19.05.2017 19:22, Karoly Balogh (Charlie/SGR) wrote:
>> Hi,
>>
>> On Fri, 19 May 2017, Sven Barth via fpc-pascal wrote:
>>
>>> I think Jeppe wanted to add vector support. Though the question here is
>>> whether one wants to optimize/detect this at the AST level and convert
>>> that to implicit vectors or at the CSE level.
>> I think the higher level you can do an optimization/simplification, the
>> higher you should do it. Otherwise the lower layers get really messy, as
>> they already are in some cases. Well, in general, we should up our
>> floating point game. For example if Nikolay's recent load-modify-store
>> optimization would work on floats, that would already be a nice step forward
>> in this case. ;) (Sorry for my ignorance, if it already works, missed that
>> then.)
No, it does not work for floats yet, but feel free to add support for
them as well :)
> I agree that we should improve that. Maybe we should also allow for more
> FPU type specific helper routines. Currently on i386 and x86_64 the x87
> FPU will be used even if -CfsseX is given and only Single/Double are
> used, because the compiler defaults to Extended. If SSE isn't used that
> might make sense, but for SSE we should fall back to Double if we're
> only dealing with double, IMHO (and analogous for Single).
>
>>> By the way: I think my commit today of a SSE Frac() implementation sped
>>> up the framerate by a third on Win64 compared to the one without it :D
>> Cool, but shouldn't this be an inline node instead for real speed++...? ;)
>> I mean if Trunc() and Round() are...
> Ah, right, hadn't seen that we do indeed have an inline node
> implementation for x86. I should probably put that on the list then :D
Yes, we do. We can also make similar inline nodes for many routines in
the math unit. It is on my todo list, but feel free to start working on
it if you have time, since I have other things to do as well and I don't
know when I will even start on this one :) Btw, the sincos() routine is
also a good candidate for inlining, and so are the divmod routines and
the min/max routines (they are good candidates for using the cmov
instruction on i686+). Once we have these as inline nodes, we can even
add optimization passes that convert calls to sin(x) and cos(x) that are
close to each other, take the same parameter and have no side effects
between them into a single sincos() call; the same goes for div and mod
-> divmod, etc.
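
A rough source-level illustration of the rewrite such a pass would
perform. This is only a sketch: the real transformation would happen on
the compiler's internal node tree, not on source text. It assumes the
SinCos and DivMod routines and the Float type from the Math unit; the
program name and the variable names are made up for the example.

```pascal
program FusedCalls;

{$mode objfpc}

uses
  Math;

var
  Angle, S, C: Float;      // Float is the Math unit's default FP type
  Dividend: LongInt;
  Divisor, Quotient, Remainder: Word;
begin
  Angle := Pi / 6;

  // Before: two separate calls with the same argument and no side
  // effects in between.
  S := Sin(Angle);
  C := Cos(Angle);
  WriteLn('sin/cos: ', S:0:6, ' ', C:0:6);

  // After: one combined call that computes both values.
  SinCos(Angle, S, C);
  WriteLn('sincos:  ', S:0:6, ' ', C:0:6);

  Dividend := 47;
  Divisor := 5;

  // Before: div and mod on the same operands.
  Quotient := Dividend div Divisor;
  Remainder := Dividend mod Divisor;
  WriteLn('div/mod: ', Quotient, ' ', Remainder);

  // After: one DivMod call yields quotient and remainder together.
  DivMod(Dividend, Divisor, Quotient, Remainder);
  WriteLn('divmod:  ', Quotient, ' ', Remainder);
end.
```

Both forms of each pair should print the same values. The gain would
come from doing the shared work only once, which is also why the pass
needs the calls to have the same parameter and no side effects between
them.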

Nikolay