
OCR and creating editable text

hotwhiteheat
Registered: Jan 20 2010
Posts: 5

I have been assigned the task of converting scanned PDF documents (instruction booklets for science experiments) into editable documents so that changes can be made quickly. The person I am doing this for currently types the new text, prints it out, manually places the cut-out over the old text, and then rescans the page.

The documents are already scanned and saved as PDF files. There are about 200 of them, so I need to do this in batch. I have been able to run the OCR and get a bookmark that contains the extracted text, but then I'm stuck manually reformatting each booklet, and I seem to lose the illustrations as well... or do I?

If someone could point me in a direction it would be very helpful.
thanks.

I am using Adobe Acrobat Professional v9.0

try67
Expert
Registered: Oct 30 2008
Posts: 2399
Sounds like a workflow from hell. Why are you using PDFs if the files still need to be edited?
PDF should be the LAST format that you use in your workflow.

- AcrobatUsers Community Expert - Contact me personally at try6767 [at] gmail [dot] com
Check out my custom-made scripts website: http://try67.blogspot.com

hotwhiteheat
Registered: Jan 20 2010
Posts: 5
Editing these files will be an ongoing process. The files were given to me as PDFs.
rbogie
Registered: Apr 28 2008
Posts: 432
Quote:
The person I am doing this for currently types new text and then prints it out, manually places the cut-out over the old text and then rescans it.
You need a lot more help than anyone can provide on this forum, but this should help you go in the right direction:

step one: convert the PDFs to TIFs.

step two: edit the TIFs with MS Office Document Imaging (suggested because MODI comes bundled with MS Office). See MODI's tools menu; pay particular attention to tools > annotations and tools > options. Edit a test document using the text box annotation tool, make the annotation permanent, then print to any PDF printer.
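
For roughly 200 files, step one is worth scripting. The following is only a minimal sketch, assuming Ghostscript (`gs`) is installed and on the PATH; Ghostscript is not mentioned anywhere in this thread, and the function names, folder names, and the 300 dpi default are hypothetical choices for illustration. It builds one Ghostscript command per PDF and runs them only if `gs` is actually found.

```python
import shutil
import subprocess
from pathlib import Path


def tiff_command(pdf: Path, out_dir: Path, dpi: int = 300) -> list[str]:
    """Build a Ghostscript command that renders one PDF to a multi-page TIFF."""
    out_file = out_dir / (pdf.stem + ".tif")
    return [
        "gs",
        "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=tiff24nc",          # 24-bit colour TIFF (assumed; illustrations stay in colour)
        f"-r{dpi}",                   # render resolution in dpi (assumed default of 300)
        f"-sOutputFile={out_file}",   # no %d in the name, so all pages go into one file
        str(pdf),
    ]


def convert_folder(src_dir: Path, out_dir: Path, dpi: int = 300) -> list[list[str]]:
    """Convert every *.pdf in src_dir; returns the commands that were planned."""
    out_dir.mkdir(parents=True, exist_ok=True)
    commands = [tiff_command(p, out_dir, dpi) for p in sorted(src_dir.glob("*.pdf"))]
    if shutil.which("gs") is None:
        return commands               # Ghostscript not found: dry run only
    for cmd in commands:
        subprocess.run(cmd, check=True)
    return commands
```

Calling `convert_folder(Path("scans"), Path("tiffs"))` on a test folder with two or three booklets first is a sensible way to check the output quality before running all 200.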
Quote:
There are about 200 pdf files so I need to do this in batch.
Good luck!
hotwhiteheat
Registered: Jan 20 2010
Posts: 5
I do have experience as an application engineer for a software company, working on engineering product configurators, OCR for government projects, social networks for enterprise solutions, etc., but this project is a first for me. I was more of a designer than a programmer, but I can think like one and want to take on the challenge. I am trying to determine whether I can get this done myself or whether I should get some help or subcontract someone. I am just getting started with some freelance work, and this is a potential client if I can make this happen.

A few things in regard to your last response: I am using a Mac, so I am not running Windows. Why use MS Office Document Imaging? Is it unrealistic to think I can do this using Mac OS Automator, AppleScript and Adobe Acrobat Professional for the extraction, and then populate a new template using InDesign, for example? What is the reason for starting with TIFFs when I already have the PDFs and can use the OCR in Adobe Acrobat Professional to extract the scanned contents of these files?
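
As a rough illustration of the extraction half of that idea, here is a minimal sketch, not a tested workflow. It assumes the PDFs already carry an OCR text layer (for example after running Acrobat's OCR) and that the `pdftotext` command-line tool from Poppler is installed; neither the tool nor the function names below come from this thread. `pdftotext` separates pages with a form-feed character, so the text can be split into per-page chunks ready to be placed into a template.

```python
import subprocess
from pathlib import Path


def text_command(pdf: Path, txt: Path) -> list[str]:
    """Build a pdftotext command; -layout tries to preserve the page layout."""
    return ["pdftotext", "-layout", str(pdf), str(txt)]


def split_pages(text: str) -> list[str]:
    """Split pdftotext output into pages (pages are separated by form feeds)."""
    return [page.strip() for page in text.split("\f") if page.strip()]


def extract_pages(pdf: Path, out_dir: Path) -> list[str]:
    """Run pdftotext on one OCR'd PDF and return its text page by page."""
    out_dir.mkdir(parents=True, exist_ok=True)
    txt = out_dir / (pdf.stem + ".txt")
    subprocess.run(text_command(pdf, txt), check=True)
    return split_pages(txt.read_text(encoding="utf-8", errors="replace"))
```

This only recovers the text; the illustrations would still have to be taken from the original scans, and the OCR output would need proofreading before it goes into a template.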

I also found 'Adobe Capture 3' to have some advanced OCR capabilities that may be helpful. This is the only Adobe product I do not own though.

Once again, thanks for your time.