[fpc-pascal] PDF indexing

Marc Santhoff M.Santhoff at web.de
Wed Jun 24 01:04:46 CEST 2015


On Tue, 2015-06-23 at 09:10 +0200, Michael Van Canneyt wrote:
> 
> On Tue, 23 Jun 2015, Marc Santhoff wrote:
> 
> > On Sun, 2015-06-21 at 00:33 +0200, Michael Van Canneyt wrote:
> >>
> >> On Sat, 20 Jun 2015, Marc Santhoff wrote:
> >>
> >>> Hi,
> >>>
> >>> Does FPC (or Lazarus) have a helper class for indexing the content of
> >>> PDF files?
> >>
> >> check packages/fpindexer
> >>
> >> I have used it to create full text searches on a database.
> >> You should be able to adapt the base code to create an index of a PDF.
> >
> > That looks pretty interesting. And it has some docs, wow.
> >
> > If I understand correctly, I'd only have to implement a class TIReaderPDF.
> > The difference from plain text reading is the part that extracts the
> > text stream, or the text parts of the stream, and rejects the PDF
> > commands (if they are in there; I need to look at PowerPDF).
> 
> Yes, that would be correct.

Many thanks, Michael.
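
To make the "keep the text, reject the PDF commands" step concrete, here
is a minimal sketch. It is written in Python for brevity, not Pascal, and
is not based on fpindexer or PowerPDF code. The names extract_text and
read_literal are made up for illustration. It assumes the page content
stream is already decompressed, that text is stored as literal "(...)"
strings in a single-byte encoding, and it ignores hex strings and font
encoding maps entirely.

```python
import re

# Operators that actually paint text in a PDF content stream.
TEXT_SHOW_OPS = {b"Tj", b"TJ", b"'", b'"'}
ESCAPES = {b"n": b"\n", b"r": b"\r", b"t": b"\t", b"b": b"\b",
           b"f": b"\f", b"(": b"(", b")": b")", b"\\": b"\\"}
OCTAL = re.compile(rb"[0-7]{1,3}")
# Tokens skipped without interpretation: /Names, <hex strings>, % comments.
SKIP = re.compile(rb"/[^\s()<>\[\]{}/%]*|<[0-9A-Fa-f\s]*>|%[^\r\n]*")
# An operator is a run of letters (optionally ending in '*') or a lone quote.
OPERATOR = re.compile(rb"[A-Za-z]+\*?|['\"]")


def read_literal(data, pos):
    """Read the literal string whose '(' is at data[pos].

    Returns (string bytes, index just past the closing ')').
    """
    out = bytearray()
    depth = 1
    pos += 1
    while pos < len(data):
        ch = data[pos:pos + 1]
        if ch == b"\\":
            nxt = data[pos + 1:pos + 2]
            octal = OCTAL.match(data, pos + 1)
            if nxt in ESCAPES:
                out += ESCAPES[nxt]
                pos += 2
            elif octal:
                out.append(int(octal.group(), 8) & 0xFF)
                pos = octal.end()
            else:
                # Line continuation drops the EOL char; otherwise just
                # drop the backslash and keep the following character.
                pos += 2 if nxt in (b"\n", b"\r") else 1
            continue
        if ch == b"(":
            depth += 1          # balanced parentheses belong to the string
        elif ch == b")":
            depth -= 1
            if depth == 0:
                return bytes(out), pos + 1
        out += ch
        pos += 1
    return bytes(out), pos      # unterminated string: return what we have


def extract_text(content):
    """Return the shown text, one line per text-showing operator."""
    lines, pending, pos = [], [], 0
    while pos < len(content):
        if content[pos:pos + 1] == b"(":
            text, pos = read_literal(content, pos)
            pending.append(text)
            continue
        skipped = SKIP.match(content, pos)
        if skipped:
            pos = skipped.end()
            continue
        op = OPERATOR.match(content, pos)
        if op:
            # Strings count as text only if a text-showing operator follows.
            if op.group() in TEXT_SHOW_OPS and pending:
                lines.append(b"".join(pending).decode("latin-1"))
            pending = []
            pos = op.end()
            continue
        pos += 1                # numbers, brackets, whitespace
    return "\n".join(lines)


if __name__ == "__main__":
    sample = b"BT /F1 12 Tf 72 700 Td (Hello) Tj [(Wor) -20 (ld)] TJ ET"
    print(extract_text(sample))
```

A real reader would still have to locate the content streams, undo
filters such as FlateDecode, and map character codes through the font
encoding before handing the words to the indexer, which is where a PDF
access library comes in.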

I'm currently looking for a PDF access library that could help with this.
The only one that comes close so far is this one:

http://itextpdf.com/functionality

It is open source, but under a license similar to the LGPL without the
linking exception. Still searching ...

Marc

-- 
Marc Santhoff <M.Santhoff at web.de>



