
I've lost hundreds of pages of my document during the OCR process

shf781
Registered: Mar 8 2009
Posts: 6
Answered

I'm using Pro version 8.1.3 on a Windows machine.

I'm working on a PDF that's now 2004 pages, marked by about 75 different bookmarks. Once I went through the document and added headers/footers, comments, sticky notes and circles, I OCR'd it and then saved it. When I went through the document again, I found pages 308-757 were virtually blank. My headers/footers, sticky notes with remarks, comments and circles remain, but the pages are otherwise blank. When I hold the left mouse button and drag across the blank pages with the Highlight Text tool on, Acrobat highlights the areas that had text, but the text itself still isn't visible.

Did I wreck this thing for good? Do I need to replace and re-mark those 400 pages of "missing" docs? Or is there a setting I can adjust to see the images again?

Thanks.

Todd

My Product Information:
Acrobat Pro 8.1.2, Windows
Lady Cygnus
Registered: Mar 17 2009
Posts: 19
I had this happen once with white text on a black background: the OCR turned all the text white or something weird like that.

When you OCR use the following settings:
- Primary Language: English
- PDF Output Style: Searchable Image (exact)

The file will be slightly bigger, but it will look exactly like it does now.
Lady Cygnus
Registered: Mar 17 2009
Posts: 19
shf781 wrote:
It is marked by about 75 different bookmarks. Once I went through the document and added headers/footers, comments, sticky notes and circles, I OCR'd it. After I OCR'd it I saved it.
Since it isn't quite clear whether you did this: I would suggest never saving over your original file after an OCR. Save the result under a new name with "ocr" or something in it, or at the very least keep a backup of the original.

1. If you have not saved over the pre-ocr file, simply run the ocr on that file again.
2. If you have saved over the pre-ocr file and can get the original (pre-notes file) then you can do the following:

- In messed up OCR PDF > Comments > Export Comments to Data File > Save
- Check the original PDF for page consistency (did you delete any pages when adding comments, etc.?)
- Resave the Original PDF under a different name (e.g. Test, just to be cautious)
- In Test PDF > Comments > Import Comments > File you just saved

You'll now have all your comments back in (this will include all circles, pencil marks and sticky notes). I think you'll need to add the headers/footers back in by hand, but I don't work with those.

3. If you don't have an original and want to "undo" the bad OCR...I don't know of a way that can be done.
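
For anyone who wants to make the "never save over your original" habit automatic, here is a minimal sketch in Python. It has nothing to do with Acrobat itself: it only copies the file before you open it for OCR. The function name `make_ocr_working_copy` and the `_ocr` suffix are made up for illustration.

```python
import shutil
from pathlib import Path


def make_ocr_working_copy(original, suffix="_ocr"):
    """Copy `original` to a sibling file named <stem><suffix><ext>.

    Hypothetical helper: run the OCR on the returned copy so the
    pre-OCR file is never overwritten.
    """
    original = Path(original)
    if not original.is_file():
        raise FileNotFoundError(original)

    # e.g. report.pdf -> report_ocr.pdf, in the same folder
    working = original.with_name(f"{original.stem}{suffix}{original.suffix}")

    # Refuse to clobber an existing working copy.
    if working.exists():
        raise FileExistsError(working)

    # copy2 also preserves timestamps where the platform allows it.
    shutil.copy2(original, working)
    return working
```

Open the returned copy in Acrobat, OCR and save that one, and the untouched original is still there if pages come out blank.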
shf781
Registered: Mar 8 2009
Posts: 6
Thank you Lady Cygnus. I've put the document back together using a previously saved copy to replace my mysteriously white pages.

I was not aware of the "exact" image setting. That's what I would want anyway and assumed that's what I was getting...

Thanks Again!
rbogie
Registered: Apr 28 2008
Posts: 432
It's advisable to apply OCR before all the other stuff. Also, you can remove OCR'd hidden text with the Examine Document tool (menu Document > Examine Document). Save a backup and experiment.