Is there an easy, efficient way to identify searchable PDFs as opposed to image-only PDFs? The only way I know to check is to open the PDF in Adobe Acrobat and try a search. If it says 'not found' for a word I can see in the document itself, I know it's just an image. Surely there is a better way to determine this?
I ask because we want to register/create searchable PDFs only in our EDRM/ECM system (HP TRIM). Right now our employees don't know the difference and it's a pain to replace PDF images created by other line-of-business systems/applications or scanners.
To make it even more convoluted, I use Acrobat Pro 9 but our employees use a mixture of Acrobat Pro versions as well as Reader 9.
Thanks in advance -- any and all suggestions will be greatly appreciated! Melinda
The Acrobat JS API Reference has an example for counting the number of words in a document; a document that reports zero words on every page has no searchable text. That code only works in Acrobat 5 or later.
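Here is a rough sketch of that word-count idea. It is not the example from the reference itself. It relies on the Doc object's numPages property and getPageNumWords method; the function names countDocWords and isImageOnly are made up for illustration, and treating "zero words" as "image-only" is an assumption you may want to loosen (for example, a scan with a single text stamp would count as searchable).

```javascript
// Sum the words Acrobat can extract from every page of a document.
// `doc` is assumed to be an Acrobat Doc object, i.e. it provides
// numPages and getPageNumWords(nPage), with pages numbered from 0.
function countDocWords(doc) {
    var total = 0;
    for (var p = 0; p < doc.numPages; p++) {
        total += doc.getPageNumWords(p);
    }
    return total;
}

// Hypothetical helper: assume a document with no extractable words
// at all is image-only.
function isImageOnly(doc) {
    return countDocWords(doc) === 0;
}
```

In the Acrobat JavaScript console or a batch sequence, where `this` is the current document, something like `console.println(isImageOnly(this) ? "Image only" : "Searchable");` should report the result. A page image with hidden text behind it will count as searchable under this check.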
Will your system accept a page image with hidden text behind it?
George Kaiser