[fpc-pascal] FPC Graphics options?

Fri May 19 13:17:05 CEST 2017


On 05/19/2017 02:11 PM, Nikolay Nikolov wrote:
>
>
> On 05/19/2017 03:54 AM, Ryan Joseph wrote:
>>> On May 18, 2017, at 10:40 PM, Jon Foster 
>>> <jon-lists at jfpossibilities.com> wrote:
>>>
>>> 62.44      1.33     1.33 fpc_frac_real
>>> 26.76      1.90     0.57 MATH_$$_FLOOR$EXTENDED$$LONGINT
>>> 10.33      2.12     0.22 FPC_DIV_INT64
>> Thanks for profiling this.
>>
>> Floor is there as I expected and 26% is pretty extreme but the others 
>> are floating point division? How does Java handle this so much better 
>> than FPC and what are the workarounds? Just curious. As it stands I 
>> can only reason that I need to avoid dividing floats in FPC like the 
>> plague.
> Java compiles to bytecode, which isn't CPU specific, and the JVM 
> comes with a JIT compiler, which compiles the bytecode to native code 
> when the program is run, so it can always make use of the instruction 
> set supported by the CPU you're using. But, of course, launching the 
> application becomes much slower. In FPC, if you want to use SSE and 
> avoid the x87 FPU, you have to compile with specific compiler 
> options and forfeit the option for your executable to run on non-SSE 
> capable CPUs, because FPC generates native code. If you want to keep 
> compatibility and support modern instruction set extensions, you need 
> to compile different executables for different instruction sets and 
> make a launcher .exe, which detects the CPU type and runs the 
> appropriate executable. The default for the i386 compiler is to 
> target the Pentium CPU, which does not have SSE. This gives the most 
> compatibility and the least performance, but that's appropriate for 
> most users, because for most desktop applications, CPU speed is no 
> longer an issue. Only very specific tasks, such as software 3D 
> rendering, need high CPU performance, and people doing that stuff 
> usually know their compiler options very well, including how to enable 
> support for modern instruction extensions for maximum performance. Of 
> course, people coming from a Java background might not be used at all 
> to having to do this kind of stuff, but it's really not that hard.
With all that said, I'm not saying that FPC has no room left for 
optimization, only that the difference shown shouldn't be this huge if 
you use the capabilities of modern CPUs. fpc_frac_real is slow on modern 
CPUs because it uses slow x87 code instead of SSE. FPC_DIV_INT64 is 
slow because it does 64-bit division on a 32-bit CPU, using an algorithm 
that uses only 32-bit instructions. The fact that this procedure is 
a bottleneck in your code means that your code will benefit immensely if 
compiled for x86_64, which has a native 64-bit division instruction.
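
To see the effect on your own machine, something like the following 
minimal sketch could be built several ways and the timings compared. 
The program name, loop count and test values are made up for 
illustration and are not taken from the profiled code. The build 
commands in the comments use the -Cf, -Cp and -P options; check 
"fpc -h" and "fpc -i" for the values your version accepts. How much 
Frac and Floor change on i386 may also depend on how the RTL itself 
was built, so treat the numbers as a rough indication only.

```pascal
program fracbench;
{$mode objfpc}

// Hypothetical micro-benchmark: times Frac/Floor and Int64 "div"
// so that different builds of the same source can be compared.
//
// Possible builds (assumptions, verify against your FPC version):
//   fpc -O2 fracbench.pas                        (i386 default, x87 FPU)
//   fpc -O2 -CfSSE2 -CpPENTIUM4 fracbench.pas    (i386 with SSE2 math)
//   fpc -O2 -Px86_64 fracbench.pas               (needs the x86_64 compiler)

uses
  SysUtils, Math;

const
  N = 20000000; // iteration count, chosen arbitrarily

var
  i: LongInt;
  x, accF: Double;
  accI, a: Int64;
  t0: QWord;

begin
  // Floating point part: Frac and Floor on a changing value.
  accF := 0;
  accI := 0;
  t0 := GetTickCount64;
  for i := 1 to N do
  begin
    x := i * 0.37;
    accF := accF + Frac(x);
    accI := accI + Floor(x);
  end;
  WriteLn('Frac + Floor: ', GetTickCount64 - t0, ' ms (checksums ',
          accF:0:2, ', ', accI, ')');

  // Integer part: 64-bit division. On a 32-bit target this is expected
  // to go through an RTL helper, on x86_64 through a native instruction.
  a := Int64(1) shl 40;
  accI := 0;
  t0 := GetTickCount64;
  for i := 1 to N do
    accI := accI + a div i;
  WriteLn('Int64 div:    ', GetTickCount64 - t0, ' ms (checksum ',
          accI, ')');
end.
```

The checksums are printed only so that the compiler cannot drop the 
loops; they should be identical across builds, while the timings should 
differ.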

Nikolay