[fpc-pascal] fast text processing

L L at z505.com
Wed Oct 31 05:43:05 CET 2007


> Hi all,
>
> I had never used Perl before, until someone showed me that Perl is very
> fast for text processing (using its powerful regex), despite being an
> interpreted language. It even beat Delphi and FPC, though both are
> compiled languages. A few-line Perl program ran almost two times faster
> than a few-page Pascal program.
>
> How can an interpreted language be faster than a compiled language (I
> assume both have been optimized well and use the best algorithm)? Can
> FPC improve its text processing related classes/units, so they can be
> faster than Perl (as they should be, logically)? Especially the TStrings
> class.


Give us a test case (some example source code) and I will beat the living crap
out of any Perl script. The Perl interpreter and its regex engine are written in
C, so anything Perl can do, C can do at least as fast.. which means Pascal can
do better or similar. Perl is not written in Perl. In other words, when a Perl
script runs fast it is really compiled C code doing the work behind a terse
syntax.. a similar effect can be had by making terse procedural wrappers around
verbose Pascal units.

As for the number of lines it takes to write Perl code: the same nonsense goes
on with Ruby. You can write a lot of code on one line, but that doesn't matter
much if you can't go in and extend the code. Take a look at the MediaWiki
parser, and compare it to the parser I have built in Simple Wiki. PHP offers
lots of syntax shortcuts too, but you cannot maintain the MediaWiki parser like
you can a beautiful case statement.  Take a look at the Smarty source code..
lots of syntax shortcuts, horrible crap. I'm serious when I say take a look..
you have to look to know.

Build wrappers in Pascal and you'll have shorter/terser code.. for example, it
takes a lot of code to get a file into a string in Pascal.. lots of Readlns or
TStreams and Create/Free noise.. but if you use StrLoadFile() from my units it
loads the file in one call, without any class instantiation and without any
Readln. Using the stack without any Free/Create can also tersen code a lot.
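
To make the StrLoadFile() idea concrete, here is a minimal sketch of what such
a wrapper could look like. This is not the code from my units, just one
possible implementation under the assumption that a single BlockRead on an
untyped file is acceptable; the demo file name is made up, and an unreadable
file simply yields an empty string.

```pascal
program LoadFileDemo;
{$mode objfpc}{$H+}

// Hypothetical stand-in for StrLoadFile(): reads a whole file into a string
// with one BlockRead. No classes, no Readln loop.
function StrLoadFile(const FileName: string): string;
var
  F: file;
  Size: Int64;
  OldMode: Byte;
begin
  Result := '';
  OldMode := FileMode;
  FileMode := 0;                 // open read-only
  Assign(F, FileName);
  {$I-} Reset(F, 1); {$I+}       // record size 1 = byte-wise access
  FileMode := OldMode;
  if IOResult <> 0 then
    Exit;                        // file missing or unreadable: return ''
  try
    Size := FileSize(F);
    SetLength(Result, Size);
    if Size > 0 then
      BlockRead(F, Result[1], Size);
  finally
    Close(F);
  end;
end;

var
  T: Text;
  S: string;
begin
  // Write a small demo file, then load it back in one call.
  Assign(T, 'demo.txt');
  Rewrite(T);
  Write(T, 'hello');
  Close(T);

  S := StrLoadFile('demo.txt');
  WriteLn('loaded ', Length(S), ' bytes');
end.
```

The caller only ever sees one line: S := StrLoadFile('demo.txt');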

It depends on what you are doing and what source code was used.. many times
ansistrings are very slow if the person did not use a buffer and did lots of
concats.
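
A rough sketch of the buffer idea, for anyone who has not seen it: preallocate
the string, copy new pieces in with Move, and double the capacity when it runs
out, instead of doing s := s + piece every time. TStrBuf and the Buf* routines
are hypothetical names made up for this example. How much it actually gains
depends on the compiler version and memory manager, so benchmark it on your own
data.

```pascal
program BufferDemo;
{$mode objfpc}{$H+}

type
  // Hypothetical growable string buffer: Data holds the capacity,
  // Len is how much of it is actually used.
  TStrBuf = record
    Data: string;
    Len: SizeInt;
  end;

procedure BufInit(out B: TStrBuf; Capacity: SizeInt);
begin
  B.Data := '';
  SetLength(B.Data, Capacity);
  B.Len := 0;
end;

procedure BufAppend(var B: TStrBuf; const S: string);
var
  Needed, NewCap: SizeInt;
begin
  if S = '' then
    Exit;
  Needed := B.Len + Length(S);
  if Needed > Length(B.Data) then
  begin
    // Grow geometrically so reallocations stay rare.
    NewCap := Length(B.Data) * 2;
    if NewCap < Needed then
      NewCap := Needed;
    SetLength(B.Data, NewCap);
  end;
  Move(S[1], B.Data[B.Len + 1], Length(S));
  Inc(B.Len, Length(S));
end;

function BufToStr(const B: TStrBuf): string;
begin
  // Cut off the unused capacity.
  Result := Copy(B.Data, 1, B.Len);
end;

var
  B: TStrBuf;
  I: Integer;
begin
  BufInit(B, 1024);
  for I := 1 to 100000 do
    BufAppend(B, 'line of text'#10);
  WriteLn('built ', Length(BufToStr(B)), ' bytes');
end.
```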

Without any code showing what it is that is 'slow' in pascal, your post is
meaningless.. ;-)

Give me/us a test case that you want to process/parse and I'll beat the crap out
of it with real benchmarks showing the results.  I parse over 15,000 HTML files
all the time and I highly doubt some of my parsers could be beaten by Perl...
and I build wrappers around more wrappers around more wrappers if I want to
tersen the code more and more. Most of the units out there for Pascal are not
wrapped enough.. this is why it takes so many lines of code in Pascal.

Example: Synapse. With my wrapper?

s:= GetHtm('http://somesite');
parse(s);


With regular Synapse and typical VCL-style code (no offence)?

var
  blah: TSynapse;      // placeholder class names, not the real Synapse API
  p: TSomeParser;
  buf: string;

begin
 blah:= TSynapse.Create;
 blah.setupthis;
 blah.setthis:= 'blah';
 blah.method:= 'GET';
 blah.execute;
 buf:= blah.strings.text;  // read the result only after the request has run
 blah.free;
 blah:= nil;

 p:= TSomeParser.Create;
 p.setupthis;
 p.text:= buf;
 p.Exec;
 p.free;
 p:= nil;
end;


It's all about wrappers. Perl is just a wrapper. Pascal can do wrappers too.
That's why webwrite() is so easy to use, for example. Encapsulation works well
even with procedures, not just objects. Perl contains a lot of quick and dirty
procedures and syntax that are mapped to C procedures. That's all.
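
As a small illustration of a procedure encapsulating an object, here is a
sketch that hides the usual Create/try/finally/Free noise of TStringList behind
one function call. SortLines is a hypothetical name invented for this example,
not something from my units.

```pascal
program WrapperDemo;
{$mode objfpc}{$H+}

uses
  Classes;

// Hypothetical one-call wrapper: the TStringList lives and dies inside the
// function, so the caller never sees Create or Free.
function SortLines(const S: string): string;
var
  L: TStringList;
begin
  L := TStringList.Create;
  try
    L.Text := S;       // split into lines
    L.Sort;
    Result := L.Text;  // join back into one string
  finally
    L.Free;
  end;
end;

begin
  Write(SortLines('pear' + LineEnding + 'apple' + LineEnding + 'fig'));
end.
```

The class is still doing the work underneath; the wrapper just means the
calling code is one line instead of eight.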

L505



