Hi:
What is considered the best or usual way to view and edit or correct the "invisible" text behind the bitmap image after running OCR on a scanned document?
It appears that you can view the underlying text via Document | Examine Document | Hidden Text (Preview), but this doesn't seem to be very practical. Here's what I see (copied into Word for pasting here):
pennittedtobeproduced,therewouldbenowaytomarkthethousandsofdocumentswithintheCDforfuturemotions.Intheeventoffuturemotions,theCDwouldneedtobeproducedaspartofthem
Not very helpful, to say the least.
What I've done in the past is copy the whole doc (Ctrl-A, Ctrl-C), paste it into Word (Ctrl-V), and run a spell-check on it. That can yield a correctly spelled document, if you have the patience to make the corrections, but it doesn't correct the original PDF or improve its searchability.
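For what it's worth, the spell-check step can be roughed out in a script rather than in Word. This is only a sketch under assumptions of my own: the function name flag_suspect_words and the tiny KNOWN_WORDS list are made up for illustration, the sample is the start of the text above with spaces put back in by hand, and a real run would load a full dictionary file. Like the Word approach, it only points at suspect words and does nothing to the PDF itself.

```python
import re

def flag_suspect_words(text, known_words):
    """Return words in text that are missing from known_words,
    in order of first appearance, without repeats."""
    seen = set()
    suspects = []
    for word in re.findall(r"[A-Za-z]+", text):
        lower = word.lower()
        if lower not in known_words and lower not in seen:
            seen.add(lower)
            suspects.append(word)
    return suspects

# Hypothetical, deliberately tiny word list (an assumption for this sketch).
KNOWN_WORDS = {"permitted", "to", "be", "produced", "there", "would", "no", "way"}

# Sample with spaces restored by hand; the real hidden text had none.
sample = "pennitted to be produced, there would be no way"

print(flag_suspect_words(sample, KNOWN_WORDS))  # prints ['pennitted']
```

The match is case-insensitive and each suspect is reported once, so the list stays short enough to check against the scanned page by eye.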
What I'm doing now is running the scan and OCR with ABBYY FineReader 9. It allows correction of the underlying OCR text while presenting the original scanned bitmap to the user (or vice versa, per an option). If anybody has a better suggestion, please let me know.
-- Roy Zider
Would it not be less time-intensive to output the PDF directly from the source file?
If the document authors do not have Adobe products installed, then perhaps they
could obtain and use the free CutePDF product.
The output PDFs are quite basic but do contain renderable text, which supports
find/search adequately.
Be well...