help to scan technical documents into quality pdfs

rpesq
Registered: Apr 12 2009
Posts: 2

Hi,

Newbie using Acrobat 9 Standard (updated to 9.1) with new Fujitsu S1500 document scanner.

I need to scan to pdf a number of technical documents -- no color, these are white background (or, at least, should be white -- some are old and a bit discolored), mostly text and some grayscale illustrations/images and electronic schematics.

I have been playing with various options -- scanning resolutions, B&W vs Grayscale, and subsequently using "Optimize Scanned PDF" (in Acrobat 9.1) -- for several hours, and not making much progress. The issue is that the scanned documents do not print out as well as the originals, and I do not have enough expertise to know what PDF compression settings I should be using.

I am hoping that some Acrobat pros familiar with document scanning can provide some advice.

My goals:

1) PDF needs to print out as close to original as possible

2) File size optimized for small size as much as possible, as some of the documents will be 100-pages and may need to be emailed.

3) If possible, is there a way to manually clean up the scans in Acrobat? For example, removing the occasional spot / mark / staple hole that shows up in the scan. The PDF Optimizer settings, even aggressive, do not seem to remove these marks. Since the background of all of these documents is white, a simple way to "clear" the spots to a white background would be perfect.

My workflow currently consists of:
- The scanner has options for Resolution, Color/Auto, and PDF compression size. Right now, I have been using Auto for color (it seems to properly detect these as either B&W or Grayscale), I have upped the scan resolution to 600, and I am using the least PDF compression possible. The scanner then saves this PDF, which I work on in Acrobat 9.1. It is my understanding that I cannot scan directly from Acrobat with this scanner -- I must go through the Fujitsu interface first, let it create the original PDF, then "optimize" it.

- Once I get to Acrobat, I am lost as to what to do. When the scanner detects certain pages as Grayscale (pages that had a grayscale image in them, for example), the text is ~slightly~ blurry -- it could use a tiny bit of sharpening.
Also, since the scan resolution was high and least compression used, the PDF is very large. When I use the Optimize PDF Acrobat function, it certainly squishes the file size, but I feel that I could be doing this better. The printed-out page is acceptable, but definitely lower quality than the original.

Any suggestions regarding the best way to approach this project, and best settings to use, will be GREATLY appreciated. Thanks

My Product Information:
Acrobat Standard 9.1, Windows

daka630
Expert
Registered: Mar 1 2007
Posts: 1420

rpesq,

Some comments and observations.
Look over the discussion on scanning in Acrobat Help. For updated Help use Adobe's LiveDocs.
My read of this indicates that you ought to be able to feed the paper to your scanner and have Acrobat pull it in and make the PDF.
If the local machine having Acrobat has the scanner attached or can see/talk to a network attached scanner this ought to be the case.
If not, scan to TIF. Park these somewhere. Get a copy to the local machine. Point Acrobat to it & go.

In Acrobat, go to Preferences and access Convert To PDF. Edit, as needed, the settings for TIFF.

Use of "aggressive" and "adaptive" can result in compromising fidelity with original.

You describe textual content. Use B&W for the scanning. Typically, use 300 ppi for resolution.
The older, "dirty" hardcopy may benefit from 400 or 600 ppi resolution.
Sometimes, running the poorer quality originals through a good copier with tweaks
on contrast and toner density can yield an improved "master".

The lower the resolution the poorer the OCR output.
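
To put rough numbers on the resolution and B&W vs greyscale choice, here is a minimal sketch of the uncompressed size of one page before any PDF compression is applied. It assumes US Letter paper (8.5 x 11 in), which the thread does not state, and `raw_page_bytes` is just an illustrative helper name.

```python
# Rough uncompressed size of one scanned page, before any PDF compression.
# Assumption: US Letter paper (8.5 x 11 in); the thread does not give a page size.

def raw_page_bytes(ppi, bits_per_pixel, width_in=8.5, height_in=11.0):
    """Uncompressed bytes for one page at the given resolution and bit depth."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    # Row padding is ignored; this is an estimate, not a file-format calculation.
    return width_px * height_px * bits_per_pixel // 8

for ppi in (300, 400, 600):
    bw = raw_page_bytes(ppi, 1)      # B&W: 1 bit per pixel
    gray = raw_page_bytes(ppi, 8)    # greyscale: 8 bits per pixel
    print(f"{ppi} ppi: B&W {bw / 1e6:.1f} MB, greyscale {gray / 1e6:.1f} MB")
```

Greyscale starts out 8 times larger than B&W at the same resolution, and going from 300 to 600 ppi quadruples the pixel count, so a 600 ppi greyscale page begins 32 times larger than a 300 ppi B&W page before compression even enters the picture.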

"Blurry" text in greyscale is fairly typical. Stay with B&W.

Scanned images are not going to be 1:1 with originals. You can get close with expensive gear; but, I suspect that would be high resource input
for a rather small output gain.

Your desire to clean up spots, marks, etc. assures that the image will lack fidelity to the original.
Cleanup routines don't know a decimal from a spot.
1.10 liters can become 110 liters.
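
The decimal point problem can be shown with a toy despeckle routine. This is a hypothetical sketch, not what Acrobat or the Fujitsu driver actually does: it erases every connected blob of black pixels smaller than `min_size`, and the tiny "page" below is made up for illustration.

```python
# Toy despeckle: erase every connected blob of black pixels smaller than
# min_size. Hypothetical sketch, not Acrobat's or the scanner's real routine.

def despeckle(rows, min_size):
    """rows: list of strings, '#' = black, '.' = white. Returns cleaned rows."""
    grid = [list(r) for r in rows]
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if grid[y][x] != '#' or seen[y][x]:
                continue
            # Flood fill (4-connected) to collect one blob of black pixels.
            blob, stack = [], [(y, x)]
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] == '#' and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Small blobs are treated as dirt and painted white.
            if len(blob) < min_size:
                for by, bx in blob:
                    grid[by][bx] = '.'
    return [''.join(r) for r in grid]

# "1.1" drawn crudely: two vertical strokes with a one-pixel decimal point
# between them, plus a stray one-pixel speck in the top right corner.
page = [
    ".#.....#...#",
    ".#.....#....",
    ".#.....#....",
    ".#..#..#....",
]
cleaned = despeckle(page, min_size=2)
print("\n".join(cleaned))
```

The speck in the corner is gone, but so is the decimal point: to a size-based filter they are the same thing, and "1.1" now reads as "11".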

Compression for keeping fidelity high:
Monochrome Compression: CCITT G4
Grayscale Compression: ZIP
Color Compression: ZIP

JBIG2 lossless will also promote fidelity to original.
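
As a small illustration of why lossless compression keeps fidelity, the sketch below runs Python's `zlib` (Deflate, the algorithm behind the PDF "ZIP" setting) over a synthetic page and checks that it decompresses to exactly the same bytes. The page data is invented for the example and only stands in for a real scan.

```python
import zlib

# Synthetic page data: mostly white bytes (0xFF) with bands of darker rows.
# Assumption: this stands in for a real scan purely for illustration.
width_bytes, height = 319, 3300   # roughly a Letter page at 300 ppi, 1 bit per pixel
white_row = b"\xff" * width_bytes
text_row = (b"\xff\x00\xf0\x0f" * 80)[:width_bytes]
page = b"".join(text_row if y % 50 < 10 else white_row for y in range(height))

packed = zlib.compress(page, 9)       # lossless Deflate compression
restored = zlib.decompress(packed)

print(len(page), "bytes raw ->", len(packed), "bytes compressed")
print("bit-for-bit identical:", restored == page)
```

A real scan has noise and will not shrink as dramatically as this clean synthetic page, but the round trip is always exact. That is the difference from JPEG, where what comes back is only an approximation of what went in.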

Default settings for JPEG compression do the job by discarding image detail (JPEG is lossy). Fidelity to original is compromised.


When your scanner makes the image and processes this out to a PDF it is most likely performing "editing" actions
associated with the image content. Look under the hood to identify what these are. They dictate what you are presently
getting for PDFs. Acrobat can do some really neat things, BUT, it cannot turn a pig's ear into a silk purse.


File size:
If fidelity to original is important then the size is what it is.
These days most email servers can take in reasonably large attachments.
If need be, on a case-by-case basis, you could split any excessively large files into smaller parts for email.
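
For planning such a split, here is a minimal sketch that groups consecutive pages into parts that fit under an attachment limit. The per-page size and the 5 MB limit are assumed figures, and `split_for_email` is a hypothetical helper; it only works out the page ranges, and the actual splitting would still be done in Acrobat.

```python
def split_for_email(page_sizes, limit):
    """Group consecutive pages into parts whose total size is at most limit.

    page_sizes: per-page sizes in bytes (hypothetical figures).
    Returns a list of (first_page, last_page) tuples, 1-based and inclusive.
    A single page larger than limit still gets a part of its own.
    """
    parts, start, total = [], 1, 0
    for page, size in enumerate(page_sizes, start=1):
        # Close the current part if adding this page would exceed the limit.
        if total and total + size > limit:
            parts.append((start, page - 1))
            start, total = page, 0
        total += size
    if page_sizes:
        parts.append((start, len(page_sizes)))
    return parts

# 100 pages at roughly 150 KB each, with a 5 MB attachment limit (both assumed).
ranges = split_for_email([150_000] * 100, limit=5_000_000)
print(ranges)
```

With these assumed numbers each part holds 33 pages, so the 100-page document goes out as four attachments, the last one carrying a single page.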

Size for greyscale and color will be larger than B&W.

If not done already, look over what is done by Deskew, Background removal, Edge shadow removal, Descreen, and Halo Removal.
What settings are acceptable for the intended use by end-users?

If OCR is to be done, use "Searchable Image (Exact)" to leave the image integrity intact.

Be well...
