[fpc-pascal] Speed

bartek bbartek at gmx.net
Tue Oct 30 23:07:01 CET 2007


On Tuesday 30 October 2007 22:07:21 Valdas Jankūnas wrote:
> L wrote:
> >>>>> Here's one: profile your code!
> >>>>
> >>>> And where can you read how to do that?
> >
> > VALGRIND
>
>   I ran my app through "valgrind -v --tool=callgrind ./my_app",
> opened the generated report with KCachegrind and looked at how many times
> certain procedures were called, etc.
>   Sorry for the stupid question: I have no idea how I can check with
> Valgrind whether one coding style is faster than another (see root post)?
>
>   One thing comes to mind: call the function under test many times and
> measure with GetTickCount how long it takes...

KCachegrind should report the inclusive (cumulative) and self costs of your
functions. At least it does for me.
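
The timing idea quoted above can be sketched roughly as follows. This is only
a minimal sketch: the body of SubCalculate and the Iterations constant are
made-up stand-ins, and it assumes a recent FPC whose SysUtils unit provides
GetTickCount64 (older versions may need a different timer).

```pascal
program TimeIt;
{$mode objfpc}

uses
  SysUtils;

const
  Iterations = 10000000; // hypothetical repeat count; tune for your machine

// Hypothetical stand-in for the routine under test.
function SubCalculate(X: Double): Double;
begin
  Result := X * 1.000001 + 0.5;
end;

var
  I: LongInt;
  Acc: Double;
  StartTick, Elapsed: QWord;
begin
  Acc := 0.0;
  StartTick := GetTickCount64;
  for I := 1 to Iterations do
    Acc := SubCalculate(Acc);
  Elapsed := GetTickCount64 - StartTick;
  // Printing Acc makes sure the loop's result is actually used.
  WriteLn('result  = ', Acc:0:6);
  WriteLn('elapsed = ', Elapsed, ' ms');
end.
```

Run both variants with the same iteration count and compare the elapsed
values; repeat a few times, since a millisecond tick counter is coarse.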

> I am writing a program that performs extensive calculations, and I
> want to write fast code. A question about execution speed:
>   first version:
>
>
> function Calculate1: Extended;
>    function SubCalculate: Extended;
>    begin
>      ...
>    end;
> begin
>    ...
>    ..:=SubCalculate;
>    ...
> end;
>
>
>   second version:
>
>
> function SubCalculate: Extended;
> begin
>    ...
> end;
>
> function Calculate2: Extended;
> begin
>    ...
>    ..:=SubCalculate;
>    ...
> end;
>
>
>   I think the first version is faster than the second, because in the first
> version the SubCalculate function is inside the body of the calling function?

IIRC, it does not matter. Only the scope of the function differs. If you are
that concerned about the call overhead of your function, you can inline it.
E.g.:
<Code>
function SubCalculate: Extended; inline;
begin
end;
</Code>
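
A slightly fuller sketch of the same idea, as a complete program. The
parameter X and the function bodies are hypothetical, since the original post
elides the real calculation. As far as I know, FPC only honours the inline
modifier when inlining is switched on ({$inline on} or -Si), so the sketch
enables it explicitly.

```pascal
program InlineDemo;
{$mode objfpc}
{$inline on}  // allow the compiler to honour the inline modifier

// Hypothetical body; the original post elides the real calculation.
function SubCalculate(X: Extended): Extended; inline;
begin
  Result := X * X + 1.0;
end;

function Calculate2(X: Extended): Extended;
begin
  // With inlining enabled the compiler may expand SubCalculate here
  // instead of emitting a call; it remains free to ignore the hint.
  Result := SubCalculate(X) + SubCalculate(X + 1.0);
end;

begin
  WriteLn(Calculate2(2.0):0:2);  // prints 15.00
end.
```

Whether the call really was inlined is best checked in the assembler output.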

>
> P.S. Where can I read tips about writing fast FP code?

FP is no Java. You get what you write. If you don't use virtual methods in
OO code, there are no hidden performance killers.
First identify the bottleneck in your program (SubCalculate) with valgrind
(which you did). Now you can check this code by looking at the assembler
output of the FP compiler: "fpc -alrnt ...". You can use this as a base for
hand-optimized assembler code.
As a side note: I see you are using Extended. Newer x86 CPUs support the
SSE1/2/3/4 extensions, which can process multiple doubles at mind-boggling
speed. If you can, you should consider switching to Double and then writing
SSE code if you are using an x86 CPU.
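
To illustrate the switch to Double, here is a minimal sketch of a dot product
over plain Double arrays. The names TVec, Dot and N are made up for this
example. Whether the compiler emits SSE instructions for it depends on the
target and options (I believe x86_64 uses SSE2 for Double by default, and
i386 needs something like -CfSSE2), and packed operations on several doubles
at once would most likely still have to be written by hand.

```pascal
program DoubleDemo;
{$mode objfpc}

const
  N = 1000;  // hypothetical vector length

type
  TVec = array[0..N - 1] of Double;

// Dot product over contiguous 8-byte Double values, a layout that
// suits SSE2 registers better than the x87-only Extended type.
function Dot(const A, B: TVec): Double;
var
  I: Integer;
begin
  Result := 0.0;
  for I := 0 to N - 1 do
    Result := Result + A[I] * B[I];
end;

var
  A, B: TVec;
  I: Integer;
begin
  for I := 0 to N - 1 do
  begin
    A[I] := 1.0;
    B[I] := 2.0;
  end;
  // SizeOf(Extended) varies by platform, so it is printed, not assumed.
  WriteLn('SizeOf(Double)   = ', SizeOf(Double));
  WriteLn('SizeOf(Extended) = ', SizeOf(Extended));
  WriteLn('dot = ', Dot(A, B):0:1);
end.
```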

You *can* write code with FP that rivals C.

bartek


