OCR/Indexing of PDFs

andysz
Registered: Feb 10 2010
Posts: 2

I have about 6000 pages of documents that are in PDF format, but not text searchable. All 6000 pages are in the same format containing certain information that I would like to easily index -- names, dates, times. My goal is to find a way to convert those into text searchable files and then index them into an Excel Spreadsheet or something similar. The end product would hopefully be a spreadsheet that contains pertinent information such as the names and dates from each document.

I am willing to throw a few bucks at the problem in new software, but if it gets too expensive, I may just outsource the job to a litigation support company. Would any of the Acrobat products alone or in combination with another program be able to automate this process for me?

Thanks,

Andy

My Product Information:
Reader 9.2, Windows

try67
Expert
Registered: Oct 30 2008
Posts: 2398
To do any sort of batch processing you will need Acrobat Pro.
OCRing the files with a batch process is easily done. There's a built-in command in the batch process window that does that.
Creating the index you describe is more complicated and will require a custom-made script.
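
To give a rough idea of what such an indexing script involves, here is a minimal sketch in Python. It is not an Acrobat script and not necessarily how a custom-made one would be written: it assumes the files have already been OCRed and their text exported to plain text outside of Acrobat. The field labels ("Name:", "Date:", "Time:"), the date and time formats, and the function names are all hypothetical, since the actual layout of the documents is not described here. The patterns would have to be adapted to the real pages.

```python
import csv
import io
import re

# Hypothetical layout: every document is assumed to contain lines such as
# "Name: Jane Doe", "Date: 02/10/2010" and "Time: 14:30".
# These patterns are illustrative and must be adapted to the real documents.
FIELD_PATTERNS = {
    "name": re.compile(r"^Name:[ \t]*(.+?)[ \t]*$", re.MULTILINE),
    "date": re.compile(r"^Date:[ \t]*(\d{1,2}/\d{1,2}/\d{4})[ \t]*$", re.MULTILINE),
    "time": re.compile(r"^Time:[ \t]*(\d{1,2}:\d{2})[ \t]*$", re.MULTILINE),
}


def extract_fields(text):
    """Return the first match for each field, or an empty string if absent."""
    row = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        row[field] = match.group(1) if match else ""
    return row


def build_index(documents):
    """Build CSV text from a mapping of file name -> OCRed text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["file", "name", "date", "time"],
        lineterminator="\n",
    )
    writer.writeheader()
    # Sort by file name so the output order is predictable.
    for filename, text in sorted(documents.items()):
        writer.writerow({"file": filename, **extract_fields(text)})
    return buffer.getvalue()


if __name__ == "__main__":
    sample = {"doc001.pdf": "Name: Jane Doe\nDate: 02/10/2010\nTime: 14:30\n"}
    print(build_index(sample))
```

The resulting CSV text can be saved to a file and opened directly in Excel. Fields that the OCR garbled or that do not match a pattern come out blank, so those rows are easy to spot and fix by hand.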

- AcrobatUsers Community Expert - Contact me personally at try6767 [at] gmail [dot] com
Check out my custom-made scripts website: http://try67.blogspot.com