[fpc-pascal] Porter Stemming for FPC 2.0

Alan Mead cubrewer at yahoo.com
Mon Sep 19 17:40:57 CEST 2005


memsom <memsom at interalpha.co.uk> wrote:

> Reading PChars at negative indexes? Buffer underrun in other
> words... This is absolutely not a good thing. If FPC is preventing
> buffer under and overruns, then it is actually right, for once, and
> Delphi is wrong, wrong, wrong!

Yeah, it seems dumb.  Well, I'm more of a Turbo Pascal than a Delphi
programmer and not a software professional.  I don't actually know
what a PChar is...  I guess it's a pointer to a string that was added
to Delphi to talk to the Windows API?

Anyway, if you look at this guy's code, I'm convinced that he falls
into the "power user" category.  He has no problem writing ASM but he
also provides a "pure Pascal" solution (chosen at compile-time).
And he's apparently benchmarked his code and is aggressively trying to
optimize it.

And it's not a buffer under-run per se.  He's checking the ends of
words against a series of word endings.  He calculates a negative
index when he checks a long ending like '-ization' against a short
word like 'word' ... he calculates that he has to start checking at
character -3 of 'word' [length('word') - length('ization') = 4 - 7]
... The outcome of this checking is "true, the word ends in the
ending" or "false" and of course 99% of the time it's false.  I think
it's impossible that he could get a wrong result because even if the
garbage at memory p[-3] to p[-1] matches the word ending, the word
itself will not.
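
To make the arithmetic concrete, here is a minimal sketch of how the
start offset comes out negative.  SuffixStart is a hypothetical helper
I made up for illustration, not a routine from the stemmer unit, and it
only computes the offset without touching any memory.

```pascal
program SuffixOffset;
{$mode objfpc}{$H+}

{ Hypothetical helper: 0-based offset, relative to the start of the
  word, at which a comparison against ASuffix would begin. }
function SuffixStart(const AWord, ASuffix: string): Integer;
begin
  Result := Length(AWord) - Length(ASuffix);
end;

begin
  { 4 - 7: negative, i.e. three characters before the word starts }
  WriteLn(SuffixStart('word', 'ization'));
  { 12 - 7: non-negative, the comparison stays inside the word }
  WriteLn(SuffixStart('organization', 'ization'));
end.
```

Any word shorter than the ending being tested gives a negative offset,
which is the case that trips range checking.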

So, I mention all this because it is an obscure point of
incompatibility between FPC 2.0 and Delphi 5-7 (and FPC 1.x) ... In
my case, this code worked fine and then it broke... just turning off
range-checking isn't an answer for me, as I need $R+ to catch my own
errors.  Luckily, it was easy enough to wrap IF statements around
these bits.
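
The kind of guard I mean looks roughly like the sketch below.  This is
not the original author's code: EndsWith, its parameters and the sample
words are all names I made up for illustration, and it assumes the
caller passes the word length alongside the PChar.

```pascal
program GuardedEnding;
{$mode objfpc}{$H+}
{$R+}  { range checking stays on, as in my own builds }

{ Hypothetical helper: True when the first WordLen characters at P
  end with Ending. }
function EndsWith(P: PChar; WordLen: Integer;
  const Ending: string): Boolean;
var
  Start, I: Integer;
begin
  Result := False;
  Start := WordLen - Length(Ending);
  { The added IF: an ending longer than the word can never match,
    so return before anything below P[0] is read. }
  if Start < 0 then
    Exit;
  for I := 1 to Length(Ending) do
    if P[Start + I - 1] <> Ending[I] then
      Exit;
  Result := True;
end;

begin
  { Start would be -3 here, so the guard returns False }
  WriteLn(EndsWith('word', 4, 'ization'));
  { Start is 5 here, and the last seven characters match }
  WriteLn(EndsWith('organization', 12, 'ization'));
end.
```

The early exit costs one comparison per call and gives the same
true/false answer the unguarded version was relying on, without ever
forming a negative index.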

> A question... how do you know the memory at the negative index is
> valid?

I've explained why garbage won't goose the algorithm.  As to why it
does not GPF, I suppose this pchar is always pointed into the
"middle" of the data segment and the negative indices are always
single digits.  In the zip file containing my fix, I included the
test program that drives this guy's unit and the test data.  It
compiles and runs fine in Delphi 7 (you may have to comment out the
"{$mode DELPHI}") on the test data.

-Alan


