[fpc-devel] Nested functions in numlib
Werner Pamler
werner.pamler at freenet.de
Tue Apr 4 17:33:41 CEST 2017
On 04.04.2017 at 03:28, Marco van de Voort wrote:
> Did you test performance? Repeated access to parent frame in tight loops
> might be suboptimal. Could maybe be helped with some pointer work?
Right, I should have done that before asking...
Here are the results of a test running the original roof1r routine (A),
the modified one using the nested function (B), and another modified one
using a non-nested function that calls the version with the nested
function (C). In each case, several functions are passed to the root
finder, which is called 5 million times, each call with a (reproducibly)
different parameter. The percentages give the slow-down relative to the
original version:
f(x) = x
(A) ORIGINAL version: 0.656s for 5000000 runs (check: y = 0.00000000)
(B) NESTED version: 0.703s for 5000000 runs (7%)
(C) Global function calling nested function: 0.735s for 5000000 runs (12%)

f(x) = x^2
ORIGINAL version: 6.296s for 5000000 runs (check: y = 0.00000000)
NESTED version: 6.313s for 5000000 runs (0%)
Global function calling nested function: 6.546s for 5000000 runs (4%)

f(x) = exp(x)
ORIGINAL version: 6.734s for 5000000 runs (check: y = 0.00000000)
NESTED version: 6.703s for 5000000 runs (0%)
Global function calling nested function: 6.890s for 5000000 runs (2%)

f(x) = arcsin(x)
ORIGINAL version: 5.718s for 5000000 runs (check: y = 0.00000000)
NESTED version: 5.718s for 5000000 runs (0%)
Global function calling nested function: 5.937s for 5000000 runs (4%)

f(x) = erf(x)
ORIGINAL version: 6.391s for 5000000 runs (check: y = 0.00000000)
NESTED version: 6.422s for 5000000 runs (0%)
Global function calling nested function: 6.673s for 5000000 runs (4%)

f(x) = gammaLn(x)
ORIGINAL version: 15.260s for 5000000 runs (check: y = 0.00000000)
NESTED version: 15.142s for 5000000 runs (-1%)
Global function calling nested function: 15.426s for 5000000 runs (1%)

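My interpretation of these results is that calling variant C does not
cause a dramatic slow-down: it costs 12% for the trivial f(x) = x, where
call overhead dominates, and 1-4% for the other functions. Variant B
(nested function) runs at roughly the same speed as the original
procedure, within about 1% except for the 7% seen with f(x) = x.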
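
For readers who have not seen the code, the difference between variants B and C can be sketched roughly as follows. This is only an illustration under assumptions, not the numlib code: the names TFunc, TNestedFunc, BisectNested and Bisect are made up for this sketch, Double stands in for numlib's float type, and a plain bisection replaces the actual roof1r algorithm. The point is only to show a routine taking an "is nested" function type (B) and a global-function overload that forwards to it through a nested wrapper (C).

```pascal
program nestedsketch;

{$mode objfpc}
{$modeswitch nestedprocvars}

type
  // Plain function pointer, as a variant-A style routine would take it
  // (hypothetical name).
  TFunc = function(x: Double): Double;
  // Nested function pointer, as in variant B (hypothetical name).
  TNestedFunc = function(x: Double): Double is nested;

// Variant B stand-in: accepts a nested function.
// Simple bisection, NOT the algorithm used by roof1r.
// Assumes f(a) and f(b) have opposite signs.
function BisectNested(f: TNestedFunc; a, b, tol: Double): Double;
var
  fa, fm, m: Double;
begin
  fa := f(a);
  while (b - a) > tol do
  begin
    m := 0.5 * (a + b);
    fm := f(m);
    if (fa < 0) = (fm < 0) then
    begin
      // Same sign at a and m: the root lies in [m, b].
      a := m;
      fa := fm;
    end
    else
      b := m;
  end;
  Result := 0.5 * (a + b);
end;

// Variant C stand-in: keeps the old global-function interface and
// forwards to the nested version through a local wrapper.
function Bisect(f: TFunc; a, b, tol: Double): Double;

  function Wrapper(x: Double): Double;
  begin
    // f is a parameter of the enclosing function, so this call goes
    // through the parent frame (the extra indirection being measured).
    Result := f(x);
  end;

begin
  Result := BisectNested(@Wrapper, a, b, tol);
end;

// A global test function with a root at the cube root of 2.
function Cube(x: Double): Double;
begin
  Result := x * x * x - 2.0;
end;

procedure Demo;
var
  shift: Double;

  // A nested function that reads a local variable of Demo;
  // this is what a plain TFunc cannot express.
  function Shifted(x: Double): Double;
  begin
    Result := x - shift;
  end;

begin
  shift := 0.25;
  WriteLn('nested (root of x - shift): ', BisectNested(@Shifted, 0.0, 1.0, 1e-9):0:6);
  WriteLn('global (root of x^3 - 2):   ', Bisect(@Cube, 0.0, 2.0, 1e-9):0:6);
end;

begin
  Demo;
end.
```

In this sketch the wrapper adds one extra call per function evaluation, which would fit the observation above that the overhead of variant C is most visible when f itself is very cheap.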