
Distinguishing PDF file content

2010-11-04 12:36:06

irap

Registered: Aug 6 2008

Posts: 56

We're trying to figure out how to distinguish between the content of different PDFs.

We've identified three types:
1. PDFs created with a "distiller" of some kind.
2. PDFs that are image only.
3. PDFs that are images with OCR text.

Depending on the type of PDF, we want to route the files through different parts of our production system.

Presumably we'll need to look at the internals of the PDFs to figure this out.

Thanks.

Ira

2010-11-04 13:06:09

UVSAR

Registered: Oct 29 2008

Posts: 1357

Are you trying to do this programmatically or visually?

If the latter, Document Properties will show you which application produced the file. OCRed text can be selected with the multi-select tool, whereas unprocessed images of text can't.

2010-11-04 13:08:14

irap

Registered: Aug 6 2008

Posts: 56

Programmatically.

2010-11-05 07:40:25

mattbeals

Registered: May 10 2007

Posts: 40

There are tools like Enfocus Switch that can route files based on XMP, file names, file types and other parameters.

2010-11-09 08:51:48

irap

Registered: Aug 6 2008

Posts: 56

The only thing we'll know is that it is a PDF file.
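
For a purely programmatic check on an arbitrary PDF, one possible starting point is a rough byte-level heuristic. The Python sketch below is not a definitive implementation, and the function names (classify_pdf, decoded_streams) and return labels are made up for illustration. It assumes that image-only files contain image XObjects but no text-showing operators, that OCR layers are drawn in text rendering mode 3 (invisible text) on top of an image, and that content streams are either uncompressed or Flate-compressed. Encrypted files, other stream filters, and OCR output that uses visible text would be misclassified, so a real PDF library is likely the safer route for production use.

```python
import re
import zlib

# Raw "stream ... endstream" payloads; the keyword is followed by CRLF or LF.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# Image XObjects are always stream objects, so their dictionaries are visible
# in the raw file bytes.
IMAGE_RE = re.compile(rb"/Subtype\s*/Image\b")
# A string or array operand followed by a text-showing operator (Tj or TJ).
SHOW_TEXT_RE = re.compile(rb"[)>]\s*Tj\b|\]\s*TJ\b")
# Text rendering mode 3 is "invisible"; assumed here to indicate an OCR layer.
INVISIBLE_RE = re.compile(rb"(?<![\d.])3\s+Tr\b")


def decoded_streams(data):
    """Yield non-image stream payloads, inflated when Flate-compressed."""
    for match in STREAM_RE.finditer(data):
        # Look at the bytes just before "stream" to find this object's
        # dictionary, trimmed so a previous object's dictionary is ignored.
        header = data[max(0, match.start() - 300):match.start()]
        obj_at = header.rfind(b"obj")
        if obj_at != -1:
            header = header[obj_at:]
        if IMAGE_RE.search(header):
            continue  # pixel data could contain bytes that look like operators
        payload = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes before "endstream".
            payload = zlib.decompressobj().decompress(payload)
        except zlib.error:
            pass  # not Flate data; scan the payload as-is
        yield payload


def classify_pdf(data):
    """Return 'image_only', 'image_with_ocr', 'text', or 'unknown'."""
    has_image = bool(IMAGE_RE.search(data))
    shows_text = False
    invisible_text = False
    for content in decoded_streams(data):
        if SHOW_TEXT_RE.search(content):
            shows_text = True
            if INVISIBLE_RE.search(content):
                invisible_text = True
    if shows_text and has_image and invisible_text:
        return "image_with_ocr"
    if shows_text:
        return "text"
    if has_image:
        return "image_only"
    return "unknown"
```

Reading a file in binary mode and passing its bytes to classify_pdf returns one of the four labels, which could then drive the routing. Files that come back as 'unknown' or 'text' with embedded images are worth spot-checking by hand before trusting the heuristic.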