Cleaning up your PDF documents
by Donna L. Baker, ACE, Baker Communications
Take out your PDF files' digital trash in Acrobat 8. Use the Examine Document feature for regular tidying up; for those tough cleaning jobs, turn to the PDF Optimizer.
There is much more to a PDF document than the obvious text and images. Other entities range from comments and attachments to layers and information from the document’s source program that you may not want or need to save. Acrobat 8 Professional offers two mechanisms for examining and dealing with document content. To remove general types of extraneous material--such as comments and hidden text--turn to the new Examine Document command. To make a document's file size as small as possible without sacrificing quality, use the PDF Optimizer.
Picking the right method
Both the Examine Document and PDF Optimizer processes include some common features. In both cases, Acrobat will check a document and locate extraneous material according to your selections. The Examine Document process is much simpler, and looks for specific features. The PDF Optimizer offers several lists of options you can configure according to your file’s configuration, contents and your desired output.
In essence, the Examine Document feature is a subset of the PDF Optimizer and contains the most prevalent and commonly removed file elements. Let’s take a look.
Examining your document
Open the file you want to evaluate and follow these steps:
1. Choose Document > Examine Document. Acrobat processes the active document and opens the Examine Document dialog. Only those items present in the file are active and indicated by checkmarks (Figure 1).

Figure 1: Select the existing content to remove from the file.
Tip: An Alert dialog displays if there are attached PDF files. The dialog explains that the process applies only to the current file, and that you have to open each attachment and run the process on it separately.
2. Check through the features identified. The types of content examined are listed in Table 1. To see hidden text, click Preview to open the Hidden Text found in [filename] dialog (Figure 2). Here you can review what has been hidden on the page. When directional-arrow buttons are shown below the preview, as in the figure, click to display hidden text on various pages of the file. Click OK to close the dialog and return to the main Examine Document dialog.

Figure 2: Review text--both hidden and visible--in the preview dialog.
Tip: “Hidden” text refers to text that is hidden beneath an image or that uses the same color as the page background, making it indistinguishable from the rest of the page.
3. Review the listed information on the Examine Document dialog indicated by the caution icon. There are other program items that are automatically removed when you delete content using the Examine Document process. Digital signatures, for example, will be removed, as will Reader-enabled rights and metadata used in a commenting workflow.
4. Clear checkmarks for any features you want to preserve and then click “Remove all checked items” to close the dialog and process the file.
5. When the file has been processed, a dialog opens summarizing the changes made (Figure 3). Click OK to close the dialog.

Figure 3: A summary of changes is shown after the file is processed.
6. Save the document to preserve the changes. Once the document is saved, you can’t revert to its unprocessed state; be sure to save it with another name if you may have to return to the original content.
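The advice in step 6 can also be handled outside Acrobat before you open the file. The following is a minimal sketch, assuming you want a copy of the unprocessed file kept beside the original; the function name `backup_before_cleanup` and the `-original` suffix are hypothetical choices for illustration, not anything Acrobat provides.

```python
import shutil
from pathlib import Path


def backup_before_cleanup(pdf_path, suffix="-original"):
    """Copy a PDF next to itself before cleaning it, and return the copy's path.

    The suffix is an illustrative naming choice, not an Acrobat convention.
    """
    source = Path(pdf_path)
    backup = source.with_name(f"{source.stem}{suffix}{source.suffix}")
    if backup.exists():
        # Refuse to overwrite an earlier backup of the unprocessed file.
        raise FileExistsError(f"Backup already exists: {backup}")
    shutil.copy2(source, backup)  # copy2 also preserves file timestamps
    return backup
```

Calling `backup_before_cleanup("report.pdf")` would leave `report-original.pdf` in the same folder, so the unprocessed content remains available after you save the cleaned file under its usual name.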
Table 1. Types of information that can be removed
| Content type | Description | Where to view in Acrobat |
| --- | --- | --- |
| Metadata | Information about the document used for searching and managing files, such as content, keywords, subject and author’s name | Choose File > Properties |
| File attachments | Files attached to the file before generating the PDF file from the source program, or those attached within Acrobat | Choose View > Navigation Panel > Attachments |
| Annotations and comments | All comments added to the document using the Comments & Markup tools, including any files attached as comments | Choose View > Navigation Panel > Comments |
| Form field logic or actions | Form fields--including signature fields--are flattened; actions and calculations are removed | Choose View > Navigation Panel > Fields or View > Navigation Panel > Signatures |
| Hidden text | Text in the PDF document that is covered by an image or other text, is transparent, or uses the same color as the background | Click Preview on the Examine Document dialog |
| Hidden layers | The layers in a document can be specified as hidden or shown; hidden layers are deleted and remaining layers are flattened into a single layer | Choose View > Navigation Panel > Layers |
| Bookmarks | A panel listing of text items used for linking to specific locations and magnifications in the document | Choose View > Navigation Panel > Bookmarks |
| Embedded search index | A text index embedded into a file to speed up search processes | Choose Advanced > Document Processing > Manage Embedded Index |
| Hidden page and image content | Content removed from a document, such as text, images or cropped page segments, that is still included in the file as file elements | -- |
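Several of the content types in Table 1 correspond to well-known dictionary keys in the PDF file format, which makes a rough pre-check possible outside Acrobat. The sketch below is a crude byte scan under stated assumptions, not a substitute for Examine Document: the `MARKERS` mapping and the function names are illustrative, the scan only reports that a key appears somewhere in the file, it cannot detect hidden text, and it may miss keys stored in compressed object streams. Note also that `/Annots` covers links as well as comments, and `/OCProperties` signals layers in general rather than hidden layers.

```python
# Assumed mapping from content types in Table 1 to PDF dictionary keys.
MARKERS = {
    "Metadata": b"/Metadata",
    "File attachments": b"/EmbeddedFiles",
    "Annotations and comments": b"/Annots",
    "Form fields": b"/AcroForm",
    "Layers": b"/OCProperties",
    "Bookmarks": b"/Outlines",
}


def inventory(pdf_bytes):
    """Return the content types whose marker key occurs in the raw bytes."""
    return [label for label, key in MARKERS.items() if key in pdf_bytes]


def inventory_file(path):
    """Read a PDF from disk and run the same crude scan on its bytes."""
    with open(path, "rb") as handle:
        return inventory(handle.read())
```

A non-empty result is only a hint that the file is worth opening in Acrobat and running through the Examine Document steps above; an empty result does not prove the file is clean.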
Tip: The next time a document doesn’t behave properly, such as not printing or saving correctly, try using the Examine Document process to clean up the file. The problem may be solved by removing content such as metadata and data from other applications.
Auto-exams
Some workflows include routine examining of each PDF file for hidden content before sending it by e-mail or filing it. If your workflow involves sensitive material, such as legal documents, don’t rely on remembering whether you have processed a file; change the program preferences instead.
Choose Edit > Preferences > Document (Acrobat > Preferences > Document on Mac). Select either or both of “Examine document when closing document” and “Examine document when sending document by email”. Click OK to close the Preferences dialog. Each time you close a file or use an e-mail command, the file is examined. Of course, if you attach a PDF file to an e-mail message outside of Acrobat, the file isn’t examined first.
Making it legal
As I mentioned, some workflows include protecting sensitive material, such as documents used in legal proceedings. Acrobat 8 Professional offers tools designed specifically for legal documents, including Bates numbering and redaction. The Examine Document process is used as part of routine legal PDF document processing. For more information, refer to the article below on “New Acrobatics” by Stephen Bird in a recent issue of “The Lawyer’s PC.”
New Acrobatics
Download [PDF: 1.7 MB]
Optimizing PDF documents
Documents containing multimedia and those that have undergone significant editing in Acrobat require more content extraction than that provided in the Examine Document process. For example, a file to which you’ve added and removed or changed multiple renditions of multimedia will store unnecessary information about the renditions and changes.
Choose Advanced > PDF Optimizer to open the dialog. The first step is to analyze the document to see its contents. Click Audit space usage at the upper right of the dialog. Acrobat examines the document and displays a report (Figure 4).

Figure 4: Review the document’s components before optimizing it.
The report lists the percentages of the entire document size and size in bytes for elements such as comments, form fields and images. In the example, the embedded files use over 90 percent with images following at a distant nine percent. Click OK to close the audit report.
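The arithmetic behind such a report is simple to reproduce. The following sketch assumes you already have a byte count per element; the function name `space_usage` and the byte counts are hypothetical, chosen only to mirror the proportions described above, and are not taken from the figure.

```python
def space_usage(sizes_in_bytes):
    """Return (element, bytes, percent of total) rows, largest element first."""
    total = sum(sizes_in_bytes.values())
    if total == 0:
        return []
    rows = [
        (name, size, round(100.0 * size / total, 2))
        for name, size in sizes_in_bytes.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical byte counts, not values read from a real audit report.
example = {
    "Embedded files": 905_000,
    "Images": 90_000,
    "Form fields": 3_000,
    "Comments": 2_000,
}

for name, size, percent in space_usage(example):
    print(f"{name:<15} {size:>9,} bytes {percent:6.2f}%")
```

Sorting by size puts the element worth optimizing first, which is the same question the audit report answers: where in the file the bytes actually are.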
Configuring options
The default settings for optimization in the PDF Optimizer are the same as those of the document. If you click the Make compatible with pull-down arrow and choose another program version, the Settings name in the upper left of the dialog changes from Standard to Custom, as shown in Figure 5.

Figure 5: The Settings name changes to Custom when you choose a different compatibility version.
The options available in the different panes of the PDF Optimizer vary according to the program version selected in the Make compatible with list. The set of optimization categories is listed in a column on the PDF Optimizer dialog, shown in Figure 6. Select or clear checkboxes to include or remove categories of optimization settings.

Figure 6: Customize dozens of settings in the PDF Optimizer to balance the quality of the document against the file’s size.
Click a label in the left column on the dialog to display settings at the right of the dialog. As you look through the panes of the dialog, deselect items that you don’t want to optimize.
Here are some common types of optimizing to consider:
- Specify settings for images in your document, including color, grayscale and monochrome images. You can define compression types, quality and downsampling values, which is terrific for saving a copy of the file for quick webpage downloading.
- All fonts used in the document are listed in the dialog. Unembed fonts you don’t need embedded, such as common fonts or system fonts.
Tip: Don’t stop there: if you are distributing a document internally, and know the recipients can access a common set of fonts, don’t bother leaving any fonts embedded.
- By default, the Transparency settings aren’t active in the PDF Optimizer. Depending on your document contents, choose transparency flattening and settings, such as resolutions for text, line art and gradients.
- Select objects to remove in the Discard Objects settings. You can select objects such as form fields, alternate images and search indexes.
- On the Discard User Data pane, decide which user data objects can be removed from the document, such as layers, form content, cross-references and comments.
Note: Options available on the Examine Document dialog are offered on the Discard Objects and Discard User Data panes.
- Choose commands from the Clean Up Settings pane to take care of other cleanup details, such as removal of invalid links or bookmarks, encoding options and a method of compressing the document’s structure.
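To get a feel for why the image downsampling values in the list above matter so much, it helps to work through the pixel arithmetic. The sketch below is a rough estimate under stated assumptions: the function name `downsampled_pixels` is hypothetical, it models only the change in pixel count when resolution drops from one pixels-per-inch value to another, and it ignores compression type and quality, which also affect the final file size.

```python
def downsampled_pixels(width_px, height_px, source_ppi, target_ppi):
    """Return (new_width, new_height, fraction_of_original_pixels)."""
    if target_ppi >= source_ppi:
        # Downsampling never adds pixels; the image is left as it is.
        return width_px, height_px, 1.0
    scale = target_ppi / source_ppi
    new_width = max(1, round(width_px * scale))
    new_height = max(1, round(height_px * scale))
    fraction = (new_width * new_height) / (width_px * height_px)
    return new_width, new_height, fraction
```

For example, `downsampled_pixels(3000, 2400, 300, 150)` halves each dimension, so the image keeps only a quarter of its original pixels. Because the saving grows with the square of the resolution ratio, a modest change in the downsampling value can make a large difference for image-heavy files.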
When you’ve finished choosing optimization settings, click OK to close the dialog and open the Save Optimized As dialog. Click Save to overwrite the original file, or save the file with an alternate name.
Reusing optimizations
Configuring PDF Optimizer settings can take some time, but you don’t have to repeat the process each time you optimize a file. If you find you are configuring the same settings over and over, click Save on the PDF Optimizer dialog to name and save the settings.
The next time you have to prepare the same sort of file for distribution, choose Advanced > PDF Optimizer to open the dialog, click the Preset pull-down arrow and choose your custom settings (Figure 7). Click OK to process the file.

Figure 7: Select your customized settings rather than configuring the PDF Optimizer from scratch.
I’d like your feedback:
Do you have a routine method for preparing documents for distribution or storage? How did you develop your method? Aside from the tools and features mentioned in this article, what Acrobat features do you use?







