[fpc-pascal] Searching Text Files

Martin Collins mailinglists at collins-email.co.uk
Wed Mar 19 00:19:06 CET 2014


Hi,

I'm writing a little personal program in Lazarus that manages pdf files. 
One of the things I want to do is search for text/phrases within the 
pdfs. Has anybody tried to do this before and if so what is the best 
(easiest) way you've come across?

I've detailed what I've been doing below, but only as background 
information: after messing about with it for a couple of days I am 
not so sure this is the most sensible way to go about it, even if I can 
get it to work. The awk count command below was just a proof of concept; 
the real search was meant to be slightly more sophisticated, but I 
failed at the first hurdle!

I would appreciate your opinions and experiences. Many thanks.

Martin Collins

Free Pascal Compiler version 2.6.2-5 [2013/07/25] for x86_64
Lazarus SVN 1.3
Awk - GNU Awk 4.0.1


I'm using Linux and have access to all the opensource goodies that 
offers. I Googled for a pure pascal solution and did not find anything. 
So over the last couple of days I have been experimenting with pdftotext and 
then awk on the text files, both executed through TProcess.

Working on the bash command line awk is fine but it seems to play up 
when executed through TProcess. I think it's an awk (or stupid me) 
problem rather than a TProcess one (note: I am an awk novice and not an 
experienced programmer in general!).

The bash command line awk instruction (to count the number of search 
string instances) -

awk '$1 ~ /searchstring/ {++c} END {print c}' FS=: textfile.txt

In a simple pascal program to replicate the above, this works;

     ...
     aString := 'awk ''$1 ~ /searchstring/ {++c} END {print c}'' FS=: textfile.txt';
     AProcess.CommandLine := aString;
     AProcess.Execute;
     ...

but requires return to be pressed before the program exits, and 
'CommandLine' is deprecated!

This fails;

     AProcess.Executable := 'awk';
     aString := '''$1 ~ /searchstring/ {++c} END {print c}'' FS=: textfile.txt';
     AProcess.Parameters.Text := aString; // same with Parameters.Add
     AProcess.Execute;
     ...

with the error, awk: cmd. line:1: ^ invalid char ''' in expression
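
A possible explanation, offered as a guess rather than a confirmed diagnosis: on the bash command line the single quotes are consumed by the shell and never reach awk, whereas Executable/Parameters starts awk directly, so the quote characters arrive as part of the awk program text. Parameters.Text also splits on line breaks only, so the whole string is likely passed as a single argument. The sketch below assumes that is the cause and adds each argument separately, without the shell quotes. The program name CountMatches and the variable Lines are made up for illustration; 'searchstring' and 'textfile.txt' are the placeholders from the command above.

```pascal
program CountMatches;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, Process;

var
  AProcess: TProcess;
  Lines: TStringList;   // hypothetical name: holds whatever awk prints
begin
  AProcess := TProcess.Create(nil);
  Lines := TStringList.Create;
  try
    AProcess.Executable := 'awk';
    // One Parameters.Add per argument. No shell is involved, so the awk
    // program is passed WITHOUT the surrounding single quotes.
    AProcess.Parameters.Add('$1 ~ /searchstring/ {++c} END {print c}');
    AProcess.Parameters.Add('FS=:');
    AProcess.Parameters.Add('textfile.txt');
    // Capture awk's output through a pipe and wait for awk to finish.
    AProcess.Options := AProcess.Options + [poUsePipes, poWaitOnExit];
    AProcess.Execute;
    Lines.LoadFromStream(AProcess.Output);
    Write(Lines.Text);
  finally
    Lines.Free;
    AProcess.Free;
  end;
end.
```

Combining poWaitOnExit with poUsePipes is only reasonable here because awk prints a single short line; a command that produces a lot of output could fill the pipe and block, so a real search would probably need to read Output in a loop while the process runs.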



