I have about 6000 pages of documents that are in PDF format, but not text searchable. All 6000 pages are in the same format containing certain information that I would like to easily index -- names, dates, times. My goal is to find a way to convert those into text searchable files and then index them into an Excel Spreadsheet or something similar. The end product would hopefully be a spreadsheet that contains pertinent information such as the names and dates from each document.
I am willing to throw a few bucks at the problem in new software, but if it gets too expensive, I may just outsource the job to a litigation support company. Would any of the Acrobat products alone or in combination with another program be able to automate this process for me?
Thanks,
Andy
OCRing the files with a batch process is easily done. There's a built-in command in the batch process window that does that.
Creating the index you describe is more complicated and will require a custom-made script.
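To give a rough idea of what such a script involves, here is a minimal sketch in Python. It assumes the OCR step is already done and that the text of each page has been exported somehow. The field labels ("Name:") and the date and time formats are pure guesses, since the actual layout of the documents isn't known, and the names `FIELD_PATTERNS`, `extract_fields` and `build_index` are made up for illustration. The patterns would need to be adapted to the real pages.

```python
import csv
import io
import re

# Hypothetical layout: each page is assumed to contain a line such as
# "Name: Jane Doe", a date like 03/14/2011 and a time like 2:30 PM.
# Adjust these patterns to match the real documents.
FIELD_PATTERNS = {
    "name": re.compile(r"^Name:\s*(.+?)\s*$", re.MULTILINE),
    "date": re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4})\b"),
    "time": re.compile(r"\b(\d{1,2}:\d{2}\s?(?:AM|PM))\b", re.IGNORECASE),
}


def extract_fields(page_text):
    """Return the first match for each field, or an empty string if none."""
    row = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(page_text)
        row[field] = match.group(1) if match else ""
    return row


def build_index(pages, out):
    """Write one CSV row per page.

    `pages` is an iterable of (source_file, page_number, page_text) tuples;
    `out` is any writable text file object.
    """
    writer = csv.DictWriter(
        out, fieldnames=["source", "page", "name", "date", "time"]
    )
    writer.writeheader()
    for source, page_number, text in pages:
        row = extract_fields(text)
        row.update({"source": source, "page": page_number})
        writer.writerow(row)


if __name__ == "__main__":
    # Made-up sample page standing in for OCR output.
    sample = [("scan.pdf", 1, "Name: Jane Doe\nDate: 03/14/2011\nTime: 2:30 PM\n")]
    buffer = io.StringIO()
    build_index(sample, buffer)
    print(buffer.getvalue())
```

The resulting CSV file opens directly in Excel. Pages where a pattern finds nothing get an empty cell, which makes it easy to spot OCR errors that need manual review.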
- AcrobatUsers Community Expert - Contact me personally at try6767 [at] gmail [dot] com
Check out my custom-made scripts website: http://try67.blogspot.com