OCR without loss in quality

crackhammer
Registered: Apr 29 2010
Posts: 12

Hello folks,

I am currently scanning my books to build an e-library.

I usually scan at 400 dpi to a PDF file using our photocopier's scanner, then run OCR in Acrobat Pro 9 (on a Mac) as Searchable Image with 300 dpi downsampling. Most of the time the results are satisfactory. I run OCR only so that I can highlight text with the Highlight tool in Acrobat.

Now I have scanned a big book of about 550 pages. I made a mistake while scanning, and the scan is really awful. The image looks like Figure 8 in this tutorial: http://www.acrobatusers.com/tutorials/scanning-and-ocr-beyond-basics If I try to OCR this file, the quality gets worse, and sometimes I have difficulty even reading it.

I wonder if there is a setting that lets me retain the resolution and still perform OCR.

Many thanks in advance.

(P.S. - My Acrobat version is 9.3.3)

crackhammer
Registered: Apr 29 2010
Posts: 12
I just figured out that these PDFs were scanned at 100 ppi. I am not sure whether ppi stands for pixels per inch, but I found the value using the Advanced -> Preflight option. Does anyone have any ideas now?

Thanks in advance.
daka630
Expert
Registered: Mar 1 2007
Posts: 1420
Scan at a minimum of 300 ppi.
This provides acceptable OCR results without leaving too large a file footprint.
Sometimes poor-quality source paper will require 400 ppi or even 600 ppi.
These files have a larger footprint, but OCR accuracy becomes acceptable.
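
To put rough numbers on the footprint trade-off, here is a minimal sketch. It assumes an 8.5 x 11 inch page and 8-bit grayscale, and it counts raw, uncompressed pixels only. Real PDF sizes depend on the compression Acrobat applies, so treat the output as a relative comparison. The function name is made up for illustration.

```python
def raw_page_bytes(width_in, height_in, ppi, bits_per_pixel=8):
    """Uncompressed size in bytes of one scanned page.

    Assumption: a plain raster with no compression, so this is an
    upper bound for comparison, not an actual PDF file size.
    """
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    return width_px * height_px * bits_per_pixel // 8


# Compare the resolutions mentioned in this thread for an assumed
# 8.5 x 11 inch page.
for ppi in (100, 300, 400, 600):
    size_mb = raw_page_bytes(8.5, 11, ppi) / 1_000_000
    print(f"{ppi} ppi: about {size_mb:.1f} MB per page before compression")
```

The pixel count grows with the square of the resolution, so doubling from 300 ppi to 600 ppi quadruples the raw data per page.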

As one goes below 300 ppi, OCR accuracy declines (quickly).

"ppi" - pixels per inch (for those "electronic" things)
"dpi" - dots per inch (for those "paper" things - anyone recall using 9-pin dot matrix printers?)
It is good to Google "effective resolution".
There is much info on photo-related sites, blogs, etc., all applicable to a scanned image, which is after all an "image" ;-)
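
As a quick illustration of "effective resolution": it is the pixel count divided by the physical size at which the image is placed on the page. The sketch below is only arithmetic. The pixel and page dimensions are hypothetical examples, the function names are made up, and the 300 ppi threshold is the minimum suggested above rather than a limit built into Acrobat.

```python
def effective_ppi(pixels, inches):
    """Effective resolution: pixels spread over a physical length."""
    return pixels / inches


def meets_ocr_minimum(ppi, minimum=300):
    """True if the resolution reaches the suggested OCR minimum."""
    return ppi >= minimum


# Hypothetical example: an image 850 pixels wide placed on an
# 8.5 inch wide page, versus one that is 2550 pixels wide.
low = effective_ppi(850, 8.5)
high = effective_ppi(2550, 8.5)
print(low, meets_ocr_minimum(low))
print(high, meets_ocr_minimum(high))
```

Resampling a 100 ppi scan to a higher value changes the number but adds no real detail, so rescanning is the only way to actually gain resolution.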

Be well...