Which PDFs are already OCRed?

sandraqu
Registered: Jul 19 2007
Posts: 4

I have received a bunch of PDFs. How can I tell which ones are already OCRed without opening each file?

--sandra

Sean76
Registered: Jul 9 2007
Posts: 10
You can create a Batch Process and use the accessibility checker.
In Acro 8 Pro the steps are: Advanced -> Document Processing -> Batch Processing -> New Sequence -> Type "accessibility checker" -> Select Command -> Double click Accessibility Checker. Then set the Batch up to read the files or folder of files you are interested in.

When the file result says:
'All of the text in this document lacks a language specification'
then there is text that is recognized in the document.

When the file result says:
'This document contains no fonts. If this document appears to contain text, it may be an image-only PDF file'
then the file has no recognized text.

Hope this is helpful.
sandraqu
Registered: Jul 19 2007
Posts: 4
That's great!

Will Acro 8 also run through PDFs saved as older versions (7.0, 6.x, 5.x, 4.x, etc.) and tell the OCRed ones from the image-only ones?

--Sandra
gkaiseril
Expert
Registered: Feb 23 2006
Posts: 4307
Yes, but instead of creating a Batch Process that requires interaction for every file read, I would count the words in the PDF and either create a report of the PDFs with zero words or print the list to the JS Console.
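
A minimal sketch of that word count, assuming it runs as Acrobat JavaScript where each open file is a Doc object exposing numPages, getPageNumWords() and documentFileName. The helper names countWords and ocrReportLine are made up for illustration, and the "probably not OCRed" wording is only a guess based on a zero word count:

```javascript
// Count the words Acrobat can extract from a document.
// `doc` is an Acrobat Doc object (`this` inside a batch sequence).
function countWords(doc) {
    var total = 0;
    for (var p = 0; p < doc.numPages; p++) {
        // getPageNumWords takes a 0-based page number
        total += doc.getPageNumWords(p);
    }
    return total;
}

// Build one report line per file (hypothetical helper).
// Zero words suggests an image-only PDF with no recognized text.
function ocrReportLine(doc) {
    var words = countWords(doc);
    var status = (words === 0)
        ? "no text found (probably not OCRed)"
        : "has text";
    return doc.documentFileName + ": " + words + " words, " + status;
}
```

In a batch sequence, add the "Execute JavaScript" command with the two functions above followed by console.println(ocrReportLine(this)); and the list should show up in the JS Console once the batch has run over the folder.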

George Kaiser