[fpc-pascal] Re: Text scan in text files - (was: Full text scan - PDF files)

Tomas Hajny XHajT03 at hajny.biz
Mon Nov 1 19:50:06 CET 2010


On Mon, November 1, 2010 19:42, Marcos Douglas wrote:
> On Mon, Nov 1, 2010 at 3:36 PM, Tomas Hajny <XHajT03 at hajny.biz> wrote:
>> On Mon, November 1, 2010 19:20, Marcos Douglas wrote:
>>> On Mon, Nov 1, 2010 at 3:10 PM, Marco van de Voort <marcov at stack.nl>
>>> wrote:
>>>> In our previous episode, Marcos Douglas said:
>>>>> <albertonarduzzi at yahoo.com> wrote:
>>>>> >> Can somebody help me, please?
>>>>> >> I need to search strings in Text files using just FPC.
>>>>> >
>>>>> > how about reading every line and then using Pos() to see if some
>>>>> > string is there?
>>>>> >
>>>>>
>>>>> I don't think this is a fast way :(
>>>>> I have many PDF files with several pages each.
>>>>
>>>> You'll be surprised. I've done multi million line logfiles that way. A
>>>> pdf2txt is infinitely slow compared with such processing.
>>>
>>> Ok, but I need to work with pure text. The PDF files are "compiled".
>>
>> Sorry, I'm confused now. Do you want to search for text strings within
>> PDF files or within regular text files?
>
> I need to search for text strings in PDF files... but I don't know how
> to do that, so I use pdf2txt[1] to convert them to plain text.
> If there is a unit that can search for text within a PDF file, that
> would be the best solution for me.

I see, but that means that Marco's comment applies - conversion of PDF
files to pure text will probably require more time than finding your
string within the (converted) text file.
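
For the searching half, a minimal sketch of the line-by-line Pos()
approach suggested earlier in the thread could look like the following.
It assumes the PDF files have already been converted to plain text (for
example by pdf2txt); the program name, the ScanFile helper and the
command-line layout are only illustrative, not taken from any existing
unit.

```pascal
program FindInText;
{$mode objfpc}{$H+}

uses
  SysUtils;  { turns I/O errors into exceptions }

{ Hypothetical helper: prints every line of FileName that contains
  Needle (case-sensitive) and returns the number of matching lines. }
function ScanFile(const FileName, Needle: string): Integer;
var
  F: TextFile;
  Line: string;
  LineNo: Integer;
begin
  Result := 0;
  LineNo := 0;
  AssignFile(F, FileName);
  Reset(F);                       { open for reading }
  try
    while not Eof(F) do
    begin
      ReadLn(F, Line);
      Inc(LineNo);
      if Pos(Needle, Line) > 0 then   { Pos returns 0 when not found }
      begin
        WriteLn(FileName, ':', LineNo, ': ', Line);
        Inc(Result);
      end;
    end;
  finally
    CloseFile(F);
  end;
end;

var
  I, Total: Integer;
begin
  if ParamCount < 2 then
  begin
    WriteLn('Usage: findintext <needle> <file> [<file> ...]');
    Halt(1);
  end;
  Total := 0;
  { First argument is the search string, the rest are text files. }
  for I := 2 to ParamCount do
    Inc(Total, ScanFile(ParamStr(I), ParamStr(1)));
  WriteLn(Total, ' matching line(s)');
end.
```

Saved as findintext.pas, this should compile with "fpc findintext.pas"
and run as "findintext needle file1.txt file2.txt". Pos() is
case-sensitive; applying LowerCase() to both the line and the search
string would be one way to make the search case-insensitive.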

Tomas




