[fpc-pascal] Re: Text scan in text files - (was: Full text scan - PDF files)
md at delfire.net
Mon Nov 1 19:34:49 CET 2010
On Mon, Nov 1, 2010 at 3:31 PM, Tomas Hajny <XHajT03 at hajny.biz> wrote:
> On Mon, November 1, 2010 19:10, Marco van de Voort wrote:
>> In our previous episode, Marcos Douglas said:
>>> <albertonarduzzi at yahoo.com> wrote:
>>> >> Somebody can help me please?
>>> >> I need to search strings in Text files using just FPC.
>>> > how about reading every line and then using Pos() to see if some
>>> > string is there?
>>> I don't think this way is the fast way :(
>>> I have many PDF files with several pages each.
>> You'll be surprised. I've done multi million line logfiles that way. A
>> pdf2txt is infinitely slow compared with such processing.
> Well, there are at least two gotchas there. First, it's better to use a
> reasonable (= large enough) buffer size. Second, the simplest approach
> implying reading line by line and searching using Pos() obviously isn't
> sufficient for searching across line breaks, i.e. you either need to
> handle that yourself, or use some unit providing such functionality.
Which unit do you recommend?
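
For reference, here is a minimal sketch of the line-by-line approach as I understand it: ReadLn plus Pos(), with a larger text buffer set through SetTextBuf. The 64 KiB buffer size and the names ScanText and CountMatchingLines are only my assumptions for illustration, and this version does not find matches that span a line break.

```pascal
program ScanText;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: counts the lines of FileName that contain Needle.
// The comparison is case-sensitive because Pos() is.
function CountMatchingLines(const FileName, Needle: string): Integer;
var
  F: TextFile;
  Buf: array[0..65535] of Byte; // 64 KiB I/O buffer; the size is a guess
  Line: string;
begin
  Result := 0;
  AssignFile(F, FileName);
  SetTextBuf(F, Buf); // replace the small default buffer before opening
  Reset(F);
  try
    while not Eof(F) do
    begin
      ReadLn(F, Line);
      if Pos(Needle, Line) > 0 then
        Inc(Result);
    end;
  finally
    CloseFile(F); // Buf must stay valid until the file is closed
  end;
end;

begin
  if ParamCount <> 2 then
  begin
    WriteLn('usage: scantext <file> <needle>');
    Halt(1);
  end;
  WriteLn(CountMatchingLines(ParamStr(1), ParamStr(2)));
end.
```

To catch a match across a line break, I suppose one would have to keep the tail of the previous line (Length(Needle) - 1 characters) and search the joined text, or read the file in blocks instead of lines.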