Distinguishing PDF file content

irap
Registered: Aug 6 2008
Posts: 56

We're trying to figure out how to distinguish between the content of different PDFs.
 
We've identified:
1. PDFs created with a "distiller" of some kind.
2. PDFs that are image only.
3. PDFs that are images with OCR text.
 
Depending on the type of PDF, we want to route the files through different parts of our production system.
 
Presumably we'll need to look at the internals of the PDFs to figure this out.
 
Thanks.
 
Ira

UVSAR
Expert
Registered: Oct 29 2008
Posts: 1357
Are you trying to do this programmatically or visually?

If the latter, Document Properties will show you what made the file, and text that's been OCRed can be selected with the multi-select tool, whereas unprocessed images of text can't.
irap
Registered: Aug 6 2008
Posts: 56
Programmatically
mattbeals
Registered: May 10 2007
Posts: 40
There are tools like Enfocus Switch that can route files based on XMP, file names, file types and other parameters.
irap
Registered: Aug 6 2008
Posts: 56
The only thing we'll know is that it is a PDF file.
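
If the file itself is all there is to go on, one rough way to separate the three cases from the first post is to scan the raw PDF objects for image XObjects and for text-showing operators in the content streams. The sketch below is only a heuristic, not a tested production method. The function name classify_pdf_bytes and the labels it returns are made up for illustration. It assumes an unencrypted file whose content streams are either uncompressed or Flate-compressed, and it assumes OCR text is written as invisible text (text rendering mode 3, the "3 Tr" operator), which is common but not universal. A real PDF parsing library would be more reliable.

```python
import re
import zlib

# Heuristic patterns (assumptions, not a full PDF parser).
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
IMAGE_RE = re.compile(rb"/Subtype\s*/Image\b")
# A string or array operand followed by a text-showing operator (Tj or TJ).
SHOW_TEXT_RE = re.compile(rb"[)\]>]\s*(?:Tj|TJ)\b")
# Text rendering mode 3 (invisible text), often used for OCR layers.
INVISIBLE_RE = re.compile(rb"(?<![\d.])3\s+Tr\b")


def classify_pdf_bytes(data: bytes) -> str:
    """Return 'text', 'image-with-ocr', 'image-only' or 'unknown' (hypothetical labels)."""
    has_image = has_visible_text = has_invisible_text = False

    # Crude split into objects; binary data containing b"endobj" would confuse this.
    for obj in data.split(b"endobj"):
        if IMAGE_RE.search(obj):
            # Image XObject: note it and skip its binary payload.
            has_image = True
            continue
        match = STREAM_RE.search(obj)
        if not match:
            continue
        body = match.group(1)
        try:
            # decompressobj ignores trailing bytes such as the EOL before "endstream".
            body = zlib.decompressobj().decompress(body)
        except zlib.error:
            pass  # Not Flate data; inspect the raw bytes instead.
        if SHOW_TEXT_RE.search(body):
            # A stream that ever switches to mode 3 is treated as an OCR layer.
            if INVISIBLE_RE.search(body):
                has_invisible_text = True
            else:
                has_visible_text = True

    if has_visible_text:
        return "text"            # likely produced by a distiller or similar
    if has_image and has_invisible_text:
        return "image-with-ocr"
    if has_image:
        return "image-only"
    return "unknown"
```

To try it on a file, read it in binary mode and pass the bytes in, for example classify_pdf_bytes(open(path, "rb").read()), where path is whatever file you are routing. Files that mix visible text with scanned pages, or OCR output that writes visible text, will be misclassified by this sketch, so treat the result as a first-pass hint.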