Acrobat User Community

Scanning to PDF and OCR conversion with Acrobat X

By Kurt Foss – March 10, 2011

In this tutorial, based on chapter 16 of The Acrobat X PDF Bible, learn how to scan to PDF using Acrobat Scan and how to convert scanned images to text using Acrobat's OCR engine.

Scanning and OCR Conversion in Adobe Acrobat X

This summary is excerpted from chapter 16 of The Acrobat X PDF Bible by Ted Padova, published by Wiley. Follow the link further below to download the complete 34-page sample chapter.

Welcome to the world of one-stop scanning and text recognition. Anyone who scanned to PDF in earlier versions of Acrobat will appreciate the one-step operation for scanning to PDF and performing text recognition via Acrobat's Optical Character Recognition (OCR) engine, along with the many new enhancements to Acrobat Scan. The one-stop scanning introduced in Acrobat 8 was improved in Acrobat 9 with a new scanning technology called ClearScan, presets that support custom settings you can configure for various source materials, the ability to create multiple files from scanned documents and to add multiple scanned documents to a PDF Portfolio, and much more.

When performing a scan in Acrobat, you are not limited to scanning documents for text conversion. Acrobat also enables you to scan photos and other images for different uses. This chapter therefore covers all aspects of scanning from within Acrobat using the Scan to PDF command and the Text Recognition commands.

In chapter 16 of The Acrobat X PDF Bible, you learn how to scan documents using Acrobat Scan and how to convert scanned images to text using Acrobat's OCR engine. Topics covered include:

  • Acrobat Scan provides choices for using presets and customizing presets with WIA-compliant scanners on Windows.
  • On the Mac, and on Windows with scanners that are not WIA-compliant, Create PDF From Scanner uses TWAIN drivers or Adobe Photoshop Acquire plug-ins.
  • Properly preparing the scanner and documents for scanning in Acrobat improves the quality of the scans. The scanner platen should be clean, the documents should be straight, and the contrast should be sharp.
  • When scanning images in Acrobat, use the scanning software to establish resolution, image mode, and brightness controls before scanning. Test your results thoroughly to create a formula that works well for the type of documents you scan.
  • Workflow automation can be greatly improved by purchasing Adobe's stand-alone product Adobe Acrobat Capture. When using Adobe Acrobat Capture with a scanner supporting a document feeder, the scanning and capturing can be performed with unattended operation.
  • Acrobat Capture is a stand-alone application for optical character recognition used for converting scanned images into editable text.
  • Acrobat uses a new technology called ClearScan that replaces the Formatted Text & Graphics PDF output style used with earlier versions of Acrobat.
  • Text can be converted and saved using the ClearScan PDF output style, which lets you edit text and change the appearance of the original scan. Alternatively, text can be converted with Optical Character Recognition and saved using the Searchable Image option, which preserves the original document appearance and adds a text layer behind the image.
  • Acrobat enables you to perform OCR text recognition on multiple scanned documents saved in any format compatible with the Create PDF From File command.
  • OCR suspects are marked when the OCR engine does not find an exact word match in its dictionary. Text editing is performed in the Find Element dialog box.
  • To import text into Microsoft Word, use the File > Save As > Word Document command in Acrobat to export to a Word file and open the exported file in Microsoft Word.
  • Scanned paper forms can be populated with form fields. When you enter Form Editing Mode, Acrobat uses auto field detection to place field objects on the page.
  • You can convert a paper form to a fillable PDF form using a single menu command in Acrobat.
  • Digital cameras can be used in lieu of a scanner and can often speed up the scanning process.
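
To make the idea of OCR suspects more concrete, here is a minimal Python sketch of the general technique: flag any recognized word that has no exact match in a word list. It is not Acrobat's OCR engine or API. The function name find_suspects, the tiny dictionary, and the sample OCR output are all hypothetical and exist only for illustration.

```python
import re


def find_suspects(recognized_text, dictionary):
    """Return recognized words that have no exact match in the dictionary.

    Illustrative only: a real OCR engine weighs far more evidence than a
    simple word-list lookup.
    """
    # Split into word-like tokens, keeping digits so misread characters
    # such as "1" for "l" stay inside the token.
    words = re.findall(r"[\w']+", recognized_text)
    return [word for word in words if word.lower() not in dictionary]


# Hypothetical word list and OCR output (assumptions, not Acrobat data).
dictionary = {"the", "scanner", "platen", "should", "be", "clean"}
ocr_output = "The scanner p1aten should be c1ean"

suspects = find_suspects(ocr_output, dictionary)
print(suspects)
```

In this sketch the two words containing a misread digit are reported as suspects, which corresponds to the words you would then review and correct by hand.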

Download a PDF file of chapter 16 from The Acrobat X PDF Bible on Scanning and OCR Conversion. [PDF: 786kb]