Get and Set PDF metadata from Acrobat JavaScript

Learn how to set the metadata in a PDF to include parameters such as the document’s title, author, and a set of keywords.

By Thom Parker – June 28, 2006

by Thom Parker, Software Developer/Adventurer, WindJack Solutions, Inc.

Scope: Acrobat 5 and Up
Skill Level: Beginner
Prerequisites: Familiarity with Acrobat

Each PDF has a set of metadata that includes parameters such as the document’s title, author, and a set of keywords. In addition to these standard parameters, the metadata can also include custom (user-defined) parameters. The importance of document metadata has been espoused many times by many authors (most notably, Duff Johnson). Many systems that handle documents use the metadata for organizing, searching, and even displaying document information. One particularly important example of this is Google. To help the document creator manage document metadata, Acrobat JavaScript provides complete access to it, including the ability to create custom entries.

Three different kinds of access

Because of the long and sordid history of document metadata and Acrobat JavaScript, there are three very different ways to access the document metadata, but only one good way.

In Acrobat 4, only the standard metadata entries (title, author, subject, and keywords) were accessible, each as a property of the Document Object. These properties were quickly deprecated with the release of Acrobat 5, so don’t use them. Instead, use the info property of the Document Object. This is the gateway to all the document metadata and the best way to access it. For example, execute the following code from the Acrobat JavaScript Console:

this.info.title = "My Summer Vacation"; // Set the Document Title 
this.info.mySpecialProp = 3; // Create a custom entry

To see the effects of this code, use the “File>Document Properties...” menu item to display the Document Properties dialog. Changes made in either location, the dialog or the info property, are immediately reflected in the other.
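The info property can also be read back from script instead of through the dialog. The following is a minimal sketch of that idea: it walks every entry of the info object, standard and custom alike, and collects them as text. The helper name listInfoEntries is hypothetical (it is not part of Acrobat JavaScript), and the sketch assumes the object passed in is a Document Object, such as "this" in the JavaScript Console.

```javascript
// Sketch only: listInfoEntries is a hypothetical helper, not an Acrobat API.
// `doc` is assumed to be a Document Object (for example, `this` in the
// Acrobat JavaScript Console), whose info property holds the metadata.
function listInfoEntries(doc) {
    var lines = [];
    // Enumerate every entry in the info object, standard or custom.
    for (var name in doc.info) {
        lines.push(name + ": " + doc.info[name]);
    }
    return lines;
}

// Usage from the Acrobat JavaScript Console:
// console.println(listInfoEntries(this).join("\n"));
```

After running the two lines shown earlier, the custom mySpecialProp entry should appear in this listing alongside the standard entries.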

In Acrobat 6, a new metadata property was added to the Document Object. The value of this property is the complete text of the document’s XMP metadata. XMP is the XML-based format used to store the metadata in a PDF. It is possible to parse and modify this string, and changes made to it are immediately reflected in both the Document Properties dialog and the info property. However, this approach is not advisable. Modifying the XMP directly is much more complex, and therefore more error prone, than using the info property.
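To give a feel for the extra complexity, here is a rough, read-only sketch that pulls the title out of an XMP string. The helper name titleFromXmp is hypothetical. The sketch assumes the title is stored in element form as a dc:title element containing an rdf:li item, which is a common XMP layout but not the only valid one, and it does not decode XML entities. A real XML parser would be needed for anything robust.

```javascript
// Sketch only: titleFromXmp is a hypothetical helper, not an Acrobat API.
// Assumption: the title is serialized as
//   <dc:title><rdf:Alt><rdf:li xml:lang="x-default">...</rdf:li></rdf:Alt></dc:title>
// Other valid XMP serializations will not be matched by this naive approach.
function titleFromXmp(xmp) {
    // Step 1: isolate the dc:title element.
    var block = /<dc:title>([\s\S]*?)<\/dc:title>/.exec(xmp);
    if (!block) {
        return null;
    }
    // Step 2: take the text of the first rdf:li item inside it.
    var item = /<rdf:li[^>]*>([\s\S]*?)<\/rdf:li>/.exec(block[1]);
    return item ? item[1] : null;
}

// Usage from the Acrobat JavaScript Console (Acrobat 6 and up):
// console.println(titleFromXmp(this.metadata));
```

Compare this with the single expression needed to read the title through the info property, and the case for avoiding direct XMP handling becomes clear.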


