[fpc-devel] Patch to speed up Uppercase/Lowercase functions
Michael Van Canneyt
michael at freepascal.org
Sat Jun 11 17:25:12 CEST 2005
On Sat, 11 Jun 2005, Daniël Mantione wrote:
> On Sat, 11 Jun 2005, Michael Van Canneyt wrote:
> > On Sat, 11 Jun 2005, Joost van der Sluis wrote:
> > > If we're gonna hold a discussion like this for every optimization, it
> > > isn't worth the effort imho. But now we're busy with it:
> > >
> > > > Well. Discussion is nice, but what does the real world show ?
> > >
> > > > To compare, I made 6 versions of Lowercase:
> > > > 1 - Sysutils
> > > > 2 - Sysutils with Joost's improvement.
> > > > 3 - Sysutils with Joost's improvement, but forward loop.
> > > > 4 - Using PChar.
> > > > 5 - Using PChar with lookup table and if check
> > > > > 6 - Using PChar with lookup table and no check.
> > >
> > > You shouldn't use the sysutils version. It's better to copy the source
> > > from sysutils to the test program.
> > >
> > > I've added that (lowercase1) and I've added Daniel's asm-procedure...
> > Timing for Daniel's procedure doesn't count, it uses shortstrings.
> > They are limited to 255 characters, so obviously this will execute faster;
> It is also a bad test, it should be converted to ansistring for proper
> comparison. The string s is declared in $H+ state, so the
> compiler has to convert an ansistring to shortstring, then do the
> lowercase and then convert the result back to ansistring to be able to
> assign to t. The fact that it is still faster shows what speed gains
> you can still get anno 2005.
I'm not contesting that; but here I think that the use of Pascal
is more important, for portability's sake.
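To make the conversion round trip described above concrete, here is a minimal sketch. LowerShort is a hypothetical pure-Pascal stand-in for the shortstring routine (the asm version is not part of this message), and the test strings are made up:

```pascal
program convdemo;
{$mode objfpc}{$H+}

// Hypothetical stand-in for a shortstring-based lowercase routine.
// It is NOT the asm procedure discussed in the thread.
function LowerShort(const s: ShortString): ShortString;
var
  i: Integer;
begin
  Result := s;
  for i := 1 to Length(Result) do
    if Result[i] in ['A'..'Z'] then
      Result[i] := Chr(Ord(Result[i]) + 32);
end;

var
  s, t: string;   // ansistrings, because {$H+} is in effect
begin
  s := 'Hello World';
  // The call converts ansistring -> shortstring for the argument,
  // and the assignment converts shortstring -> ansistring for t.
  t := LowerShort(s);
  WriteLn(t);

  // Input longer than a shortstring can hold is cut off by the conversion.
  s := StringOfChar('A', 300);
  t := LowerShort(s);
  WriteLn(Length(t));
end.
```

Each call therefore pays for two string conversions on top of the lowercasing itself, and anything past the shortstring limit is lost, which is why a comparison against the ansistring versions is not like for like.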
To show that there is room for improvement in FPC, these are the Kylix timings:
Lowercase time to execute: 00:00:00.792
Lowercase2 Time to execute: 00:00:00.705
Lowercase3 Time to execute: 00:00:00.665
Lowercase4 Time to execute: 00:00:00.438
Lowercase5 Time to execute: 00:00:00.472
Lowercase6 Time to execute: 00:00:00.569
And their routine (the first timing, Lowercase) is pure Pascal.
They are, without exception, much faster than FPC.
What is more, if I compile my test program with Optimizations (-OG2),
I get the following timings (times between brackets are without optimization):
Lowercase time to execute: 00:00:01.571
Lowercase2 Time to execute: 00:00:01.287 (00:00:01.362)
Lowercase3 Time to execute: 00:00:01.280 (00:00:01.395)
Lowercase4 Time to execute: 00:00:00.871 (00:00:00.996)
Lowercase5 Time to execute: 00:00:00.921 (00:00:01.064)
Lowercase6 Time to execute: 00:00:00.946 (00:00:00.953)
In which the version without lookup is actually faster than the
one with lookup. If compared with the timings I posted originally,
there is a 10% gain for Lowercase4, showing that optimizations are
Last, I inlined all functions (except the sysutils one):
Lowercase time to execute: 00:00:01.665
Lowercase2 Time to execute: 00:00:00.869
Lowercase3 Time to execute: 00:00:00.910
Lowercase4 Time to execute: 00:00:00.439
Lowercase5 Time to execute: 00:00:00.566
Lowercase6 Time to execute: 00:00:00.574
Now this is truly remarkable, because it comes closer to the Kylix timings.
Any explanation from the compiler people ?
> > The ansistring S used is 2400 characters (so a factor 10 longer).
> No, Joost's test uses a much shorter string (which is also the most likely
> reason the benchmarks give contradictory results).
My apologies, I (wrongly) assumed he adapted the program I posted and left S as-is.
In my opinion, all this shows that one should be careful with how
one designs tests :-)
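
The test program is not attached to this message, so as a rough illustration only, versions 4 and 6 from the list above might look something like the following. The names LowerPChar, LowerLookup, LowerTable and InitTable are hypothetical, and handling only ASCII 'A'..'Z' is an assumption about what the original versions did:

```pascal
program lowerdemo;
{$mode objfpc}{$H+}

var
  // Hypothetical lookup table: maps every char to its lowercase form.
  LowerTable: array[Char] of Char;

procedure InitTable;
var
  c: Char;
begin
  for c := Low(Char) to High(Char) do
    if c in ['A'..'Z'] then
      LowerTable[c] := Chr(Ord(c) + 32)
    else
      LowerTable[c] := c;
end;

// In the spirit of version 4: walk the string with PChar, no table.
function LowerPChar(const s: string): string;
var
  src, dst: PChar;
  i: Integer;
begin
  SetLength(Result, Length(s));
  src := PChar(s);
  dst := PChar(Result);
  for i := 1 to Length(s) do
  begin
    if src^ in ['A'..'Z'] then
      dst^ := Chr(Ord(src^) + 32)
    else
      dst^ := src^;
    Inc(src);
    Inc(dst);
  end;
end;

// In the spirit of version 6: PChar walk, table lookup, no range check.
function LowerLookup(const s: string): string;
var
  src, dst: PChar;
  i: Integer;
begin
  SetLength(Result, Length(s));
  src := PChar(s);
  dst := PChar(Result);
  for i := 1 to Length(s) do
  begin
    dst^ := LowerTable[src^];
    Inc(src);
    Inc(dst);
  end;
end;

begin
  InitTable;
  WriteLn(LowerPChar('Hello, World 123'));
  WriteLn(LowerLookup('Hello, World 123'));
end.
```

Both functions do the same work per character; the only difference is a compare-and-add versus a table load, so which one wins depends on the compiler's code generation and on the string length used in the test.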