Re: [fpc-pascal] Reading text from images
ppkk at mail.ru
Wed Dec 13 13:20:30 CET 2006
1. As far as I know, there are no free Pascal (or Free Pascal) libraries that handle PDF reliably.
However, the open-source Xpdf and Ghostscript (both C/C++) do process PDF files, though with some glitches.
The "full" PDF reference is available from Adobe, including version 1.7 (Acrobat 8). The reference has some omissions, but they concern only very uncommon protection methods. Of course, some bugs are probably present too.
2. JPEG is a well-known, fully documented standard. Free Pascal lists a package for working with JPEG among its base packages (it is included; the name is "pasjpeg"). Of course, there are other open-source libraries as well.
3. OCR is a vaguer area. Implementing full-featured OCR from scratch is an enormous task if you want it to work well, because most methods are inherently uncertain. But if you only need to recognize simple typed text and you know the fonts and the scale used, the task becomes much easier.
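To illustrate the easy case in point 3, here is a minimal sketch of template matching, written in Python only for brevity (the loops port directly to Pascal). Everything in it is an assumption made for illustration: the three 3x5 glyph bitmaps stand in for a known font at a known scale, the image is a single line of already binarized text with fixed character spacing, and the names GLYPHS, mismatch, recognize and render are hypothetical. A real scan would also need deskewing, thresholding and line/character segmentation first.

```python
# Hypothetical 3x5 bitmaps standing in for a known font at a known scale.
# '#' is an ink pixel, '.' is background.
GLYPHS = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}
GLYPH_W, GLYPH_H = 3, 5


def mismatch(image, x0, glyph):
    """Count pixels that differ between a glyph and the image cell starting at column x0."""
    return sum(
        image[y][x0 + x] != glyph[y][x]
        for y in range(GLYPH_H)
        for x in range(GLYPH_W)
    )


def recognize(image, spacing=1):
    """Read one line of fixed-width text by choosing the closest template for each cell."""
    width = len(image[0])
    text = []
    for x0 in range(0, width - GLYPH_W + 1, GLYPH_W + spacing):
        best = min(GLYPHS, key=lambda ch: mismatch(image, x0, GLYPHS[ch]))
        text.append(best)
    return "".join(text)


def render(text, spacing=1):
    """Helper for the demo: draw text with the same glyphs, one string per pixel row."""
    gap = "." * spacing
    return [gap.join(GLYPHS[ch][y] for ch in text) for y in range(GLYPH_H)]


if __name__ == "__main__":
    page = render("TIL")
    # Flip one background pixel to ink to imitate scanner noise.
    page[2] = "#" + page[2][1:]
    print(recognize(page))
```

Because the font and scale are known, each character cell is simply compared against every template and the one with the fewest differing pixels wins, so a small amount of noise is tolerated. Once fonts, sizes or spacing are unknown, this approach stops being sufficient, which is where the uncertainty mentioned above comes in.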