Searching PDFs using index in a corporate environment

dwhilsdorf
Registered: Dec 15 2008
Posts: 3

My company is considering investing in a replacement for our antiquated SoftSolutions program, which provides cataloging, full text search, etc.

I'm considering indexing our ~7 GB of drawing data, mostly PDF files, and using Acrobat Reader to search.

Trying to identify the 'gotchas' before promoting this too much. Already ran into resistance from MIS and want to be prepared.

Has anyone done this successfully, on a larger scale? What were the major issues and how were they overcome?

My Product Information:
Acrobat Standard 5.x or older, Windows

daka630
Expert
Registered: Mar 1 2007
Posts: 1420

Some observations:

A bin of ~7 GB will take a *long* time to index.
Each time a file is updated (new drawing revision) the rebuild of the index will take a *long* time.
One large bin of files can result in less than timely location of specifics by end-users.
Does each PDF have useful, meaningful metadata loaded (useful - meaningful to the various end-users - *not* "MIS" or those who are not primary users)?
Are all drawings from scanned images OCR'd? Cannot index an image, eh.
fwiw, Adobe's OCR engine will not OCR a page in a PDF that has renderable text.
Thus, a raster/vector blend on a page leaves whatever is in the raster out of the index.
Can the collection be organized in logical groupings?
Electrical | Mechanical | Fluid | System(s) | ISOs etc.?
If I want to focus on a specific system's piping and instrumentation drawings can I go directly to the sub-set, get what I need & move on (quickly)?
Or do I have to wade through all the other "stuff" to, maybe, find & look over what I need?
Smaller sub-collections are easier to maintain & thus, more sustainable by those who do the actual "maintenance".
Each sub-collection can have its own cataloged index. "Super" indexes can be provided for logical aggregations of file categories.
More upfront effort but more sustainable, flexible, usable, etc. every day after this.
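
fwiw, a rough way to size up the sub-collection idea before committing to it. This is only a planning sketch in Python, not anything Acrobat provides: it assumes the drawings already sit in one top-level folder per discipline under a shared root, and the function name, the "(ungrouped)" label and the last_indexed timestamp are all made up for illustration. It does not build or touch a catalog index; it only reports how many PDFs each folder holds, how big they are, and how many changed since the time you say that folder was last indexed.

```python
from collections import defaultdict
from pathlib import Path


def summarize_subcollections(root, last_indexed=0.0):
    """Group PDFs by top-level folder under root (hypothetical helper).

    Returns {folder_name: {"files": n, "bytes": total, "stale": m}}, where
    "stale" counts PDFs modified after last_indexed (a Unix timestamp).
    """
    root = Path(root)
    summary = defaultdict(lambda: {"files": 0, "bytes": 0, "stale": 0})
    for pdf in root.rglob("*"):
        # Only PDFs matter here; match the extension case-insensitively.
        if not pdf.is_file() or pdf.suffix.lower() != ".pdf":
            continue
        rel = pdf.relative_to(root)
        # Files sitting directly in root go into a catch-all group.
        group = rel.parts[0] if len(rel.parts) > 1 else "(ungrouped)"
        info = pdf.stat()
        entry = summary[group]
        entry["files"] += 1
        entry["bytes"] += info.st_size
        if info.st_mtime > last_indexed:
            entry["stale"] += 1
    return dict(summary)
```

A folder with a large "stale" count is one whose index is due for a rebuild; a folder that dwarfs the others in size is a candidate for splitting further.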

You wrote "Mostly PDFs...". I believe that Acrobat's catalog function will only work on PDF files.
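
If that holds, it helps to know up front how much of the bin is not PDF. A small Python sketch for that count, under the assumption that file extensions are trustworthy; the function names and the "(none)" label are mine, not part of any Adobe tool.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(root):
    """Count files under root by lowercased extension (hypothetical helper)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files with no extension are counted under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


def non_pdf_files(root):
    """List files under root whose extension is not .pdf, sorted by path."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() != ".pdf"
    )
```

Whatever shows up in the non-PDF list would need converting to PDF, or a different search tool, before it can be found through the index.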

Been "hands-on" with a much larger legacy correspondence PDF collection and, although smaller in total size, with other PDF collections.
With the appropriate Adobe applications and an understanding of how to "2 step" with them, a rather small number of individuals can process and publish well-formed, interactive PDF collections with cataloged indexes to network space, web space, or removable media.
The result is that specific information acquisition becomes easier than falling off a log out in the river.
Throw in that the Acrobat Reader is free to use...

Be well...
