Acrobat User Community

How to Save a PDF with Acrobat JavaScript

By Thom Parker, February 18, 2010

This article presents scripts not only for saving a PDF file to disk, but also for saving the PDF to different formats, such as an image file, MS Word, text and even HTML. Being able to save a file to disk is a critical activity for Acrobat workflow automation, and fortunately, there are a couple of ways to do this from an Acrobat script. In fact, this is a feature that has been around for a long time, so everything discussed here is valid for old versions of Acrobat as well as Acrobat XI.

Saving in Reader is a little different since this functionality was traditionally off limits, except for specially "Enabled" documents. As explained below, this restriction was mostly removed in version XI, making Reader a much more useful tool in Document Workflows.

So far, I've only talked about using the save feature in the context of automating workflows, but what if you want to put a save button on a form?

Placing a custom save button on a form

This is a common form feature requested in the forums. Let me start right off by saying that putting a script into a PDF form that saves the PDF can only be done under such restrictive circumstances that in most cases it is not practical. In the Acrobat/Reader environment, saving a PDF to disk is a protected operation. It’s a security issue. Users would not be very happy if random PDFs downloaded from the internet could silently save themselves to disk. The user has to know what’s going on. They have to either explicitly save the file using the "File > Save…" menu item, or implicitly allow the save through a trust mechanism.

Trust mechanisms are useful and appropriate in a small or closed environment, such as an office. They are not suitable for widely distributed files. In Acrobat, there are three main trust mechanisms: Actions (batch sequences), trusted functions and digital signatures. All three of these mechanisms provide a "Privileged Context" for code that requires trust. The first two are useful for workflow automation. The last one, digital signatures, is most useful for business documents, such as contracts that are passed back and forth between businesses or within an office. At least one of these trust mechanisms must be employed in order for a script to be able to silently save a PDF to disk.

Given the "trust" restriction, it is possible to place a custom "Save" button on a PDF, but it will only work for users that have the appropriate trust mechanism enabled on their own system.

Saving in Reader

There used to be a saying in the Acrobat/PDF community, "Reader is a reader, not a writer." The original versions of Reader did not have "Save" capabilities, the idea being that it was a free tool only used for reading (hence the name). Writing a PDF required purchasing Acrobat. Despite these sage-like words of wisdom, Adobe did provide a method for saving PDFs from Reader, called "Reader Rights Enabling (or extensions)." A Reader Right is a kind of special sauce that, when applied to a PDF, allows that PDF to be modified and saved in Reader. There are different types of Rights for different types of features, such as forms, markup, and signing. Both Form and Markup Rights became obsolete in Acrobat XI. Anyone can now fill out a form or add markup annotations to a PDF in Reader XI and then save the file. Other modifications, such as digitally signing a PDF, still require the addition of a Right to the PDF. Since anyone with Reader XI can now save filled forms and marked up PDFs, a script can also save a PDF in Reader XI without any special sauce added to that PDF.

So Why Save from a Script?

The primary reason for saving a PDF through scripting is to support workflow automation. A workflow is just the set of actions you perform on your documents in order to process them in your own special way. For example, an accounting office in a large company receives hundreds of invoices from external vendors every day. The invoices need to be logged into the accounting system, verified, paid, and archived. One important step in this process is to mark the invoice so that the current status is clearly shown. In an electronic process, PDF invoices are sent by email, logged into a database and saved as a disk file. Acrobat can play a significant role in this workflow and huge efficiency savings can be gained by fully or partially automating the process steps with JavaScript. So, instead of manually marking invoices, an Acrobat script is used to stamp the PDF file with a status marker and then automatically save it to a new name with the press of a button. Obviously, the ability to save a PDF from a script is an important part of being able to implement such a solution.

How it’s done

There are two ways to save a PDF from a script: the "Save" menu item and the doc.saveAs() JavaScript function. Keep in mind that performing a fully silent save requires one of the previously mentioned trust mechanisms. An easy way to try out the code presented in this article is to run it from the Console Window. The Console Window is a Privileged Context, so no other trust mechanism is required for testing scripts.

The simplest methodology is to use the "Save" menu item. Just type this code into the Console Window and run it.

app.execMenuItem("Save");

This code saves the currently displayed PDF file in exactly the same way as when a user selects the "File > Save…" menu item. It works great when the automation script is operating on the current document. However, an automation script could be dealing with several documents at the same time. To handle this situation, the Save menu item can be applied to a specific PDF with this code:

var oMyDoc = this; // placeholder: substitute the Doc object of the PDF being operated on
app.execMenuItem("Save", oMyDoc);

Try this code from a script on a form button. It won't work and Acrobat will not report an error. Acrobat stays silent because the menu item exists, even though it does not work in the context in which it is being used. This is a tricky situation to debug. Always check your save results by closing and reopening the file. Don't rely on Acrobat to report errors.
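Because the failure is silent, a script can at least look at the document afterwards. The sketch below is one possible check, not a documented guarantee: it assumes that the doc.dirty property goes back to false once a document has really been saved, so it only tells you something when the document had unsaved changes beforehand. The function name saveAndCheck and its oApp parameter are made up for this example; oApp is passed in so the logic can be tried with stand-in objects outside Acrobat.

```javascript
// Sketch only: oApp is normally the Acrobat "app" object. It is passed in
// so the same logic can be exercised outside Acrobat with stand-in objects.
function saveAndCheck(oApp, oDoc) {
    // Same call as in the article, applied to a specific document
    oApp.execMenuItem("Save", oDoc);
    // Assumption: doc.dirty is false once the document has really been saved
    return oDoc.dirty === false;
}

// Inside Acrobat (Console Window or another privileged context)
if (typeof app !== "undefined" && typeof app.execMenuItem === "function") {
    if (!saveAndCheck(app, this)) {
        app.alert("The document still has unsaved changes. The save did not happen.");
    }
}
```

Closing and reopening the file is still the most reliable test. This check is only a quick first signal.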

Saving to a different file name, folder, and format

The doc.saveAs() function is much more general-purpose than executing the Save menu item. For example, the code below saves the current PDF to a temporary folder using a temporary file name. Use this code where the PDF is an intermediate file in the process:

this.saveAs("/c/temp/temp.pdf");

The code above uses a hard-coded path for saving the file. Notice the format of the path. In order to deal with cross-platform issues, Acrobat uses its own file path specification called the Device Independent File Path Format. If a script does not use this path format, the doc.saveAs() function will not work.
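As a rough illustration of the path format, the helper below rewrites a Windows drive-letter path into the "/c/temp/temp.pdf" style used in this article. The name toDevicePath is hypothetical, and the sketch only handles simple drive-letter paths. Network shares and Mac paths are outside its scope.

```javascript
// Hypothetical helper: convert a Windows drive-letter path such as
// C:\temp\temp.pdf into the /C/temp/temp.pdf form used in this article.
function toDevicePath(cWinPath) {
    // Turn every backslash into a forward slash
    var cPath = cWinPath.replace(/\\/g, "/");
    // Rewrite a leading drive specifier, for example "C:", as "/C"
    cPath = cPath.replace(/^([A-Za-z]):/, "/$1");
    return cPath;
}
```

A path that is already in the device-independent form passes through unchanged, so inside Acrobat a call such as this.saveAs(toDevicePath("C:\\temp\\temp.pdf")) would receive "/C/temp/temp.pdf".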

This code saves the file to the same file name, but to a new 'hard coded' location:

this.saveAs("/c/MyDocs/" + this.documentFileName);

The following script uses string manipulation operations to separate the file name from the file path in order to save the file to a new name, but in the same location:

// Split Path into an array so it is easy to work with
var aMyPath = this.path.split("/");

// Remove old file name
aMyPath.pop();

// Add new file name
aMyPath.push("NewFileName.pdf");

// Put path back together and save
this.saveAs(aMyPath.join("/"));

When saving a file, it’s very important to include the entire path. The "doc.saveAs()" function does not automatically use the path to the current file as a base. Always specify the fully qualified path. The code above follows this recommendation by using the current file path.
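The same split/pop/push technique can be wrapped in a small reusable function. The sketch below keeps the folder and the extension and only adds a suffix to the file name, which would suit the invoice-stamping workflow described earlier. The function name addFileNameSuffix and the "_Paid" suffix are invented for illustration.

```javascript
// Hypothetical helper: insert a suffix before the file extension,
// keeping the folder part of the device-independent path unchanged.
function addFileNameSuffix(cPath, cSuffix) {
    var aParts = cPath.split("/");
    var cName = aParts.pop();              // remove the old file name
    var nDot = cName.lastIndexOf(".");
    if (nDot > 0) {
        // Place the suffix between the base name and the extension
        cName = cName.substring(0, nDot) + cSuffix + cName.substring(nDot);
    } else {
        // No extension found, so simply append the suffix
        cName = cName + cSuffix;
    }
    aParts.push(cName);                    // add the new file name
    return aParts.join("/");               // put the full path back together
}
```

In Acrobat, this.saveAs(addFileNameSuffix(this.path, "_Paid")) would then save a copy next to the original, since the result is still a fully qualified path.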

Sometimes you want the save to automatically overwrite an existing file with the same name, which is the case in the first example where a temporary file is saved. Conversely, there are also times when you’ll want to make sure a file is not automatically overwritten. If there is a conflict, you’ll want to warn the user and give them an opportunity to cancel the operation. For this situation, use the "bPromptToOverwrite" input as shown in the code below:

this.saveAs({cPath:cMyPath, bPromptToOverwrite:true});

The format of this function call is a little different. Notice the use of the curly braces, "{}", and that the input parameter names are explicitly spelled out. This format allows us to specify only the input parameters needed for the operation.

Converting a PDF to a different file format

The "doc.saveAs()" function includes input parameters for converting the PDF to a different file format. These are the same formats listed in the SaveAs dialog displayed when the user selects the "File > Save As…" menu item (Figure 1).


Figure 1 — The available Save Formats can be seen in the Save As Dialog

There are slight changes in which formats are available in different versions of Acrobat. The full listing of available formats for Acrobat XI can be found by running the following code in the Console Window (Figure 2):


Figure 2 — Save Format Names (IDs) used in JavaScript
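The code from the figure is not reproduced in this text. Since the conversion IDs come from the app.fromPDFConverters list, a minimal sketch for printing them in the Console Window might look like the following. The formatting function and its name are assumptions made for this example, not the original figure's code.

```javascript
// Format a list of conversion IDs as numbered lines
function formatConverterList(aConvIDs) {
    var aLines = [];
    for (var i = 0; i < aConvIDs.length; i++) {
        aLines.push((i + 1) + ": " + aConvIDs[i]);
    }
    return aLines.join("\n");
}

// In the Acrobat Console Window, print the IDs available in this version
if (typeof app !== "undefined" && app.fromPDFConverters) {
    console.println(formatConverterList(app.fromPDFConverters));
}
```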

To convert all pages of the current PDF into JPEG files, use this code:

this.saveAs("/c/temp/test.jpg","com.adobe.acrobat.jpeg");

The file name and path must include the correct file name extension for the conversion, which is specified in the second input parameter. This conversion ID (or format name) is taken from the app.fromPDFConverters list shown in Figure 2. In this example the conversion is to an image format. Image formats typically don't handle multiple pages, so Acrobat converts each page into an individual JPEG file. To do this, Acrobat appends _Page_# to the file name, so if the PDF had three pages, the file names would be:

test_Page_1.jpg
test_Page_2.jpg
test_Page_3.jpg

This naming convention is the same for all formats where each page is converted into an individual file.
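An automation script that needs to pick up the converted pages afterwards can work out the expected names ahead of time. The sketch below follows the _Page_# convention described above. The function name pageFileNames is hypothetical, and the page count would normally come from the document's numPages property.

```javascript
// Hypothetical helper: list the per-page file names expected for a
// page-by-page conversion, following the _Page_# convention.
function pageFileNames(cPath, nPages) {
    var nDot = cPath.lastIndexOf(".");
    // Treat a dot that belongs to a folder name as "no extension"
    if (nDot <= cPath.lastIndexOf("/")) {
        nDot = cPath.length;
    }
    var cBase = cPath.substring(0, nDot);
    var cExt = cPath.substring(nDot);
    var aNames = [];
    for (var i = 1; i <= nPages; i++) {
        aNames.push(cBase + "_Page_" + i + cExt);
    }
    return aNames;
}
```

For the three-page example above, pageFileNames("/c/temp/test.jpg", 3) gives the three paths ending in test_Page_1.jpg, test_Page_2.jpg and test_Page_3.jpg.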

The code below converts the PDF into a single MS Word file, since of course, Word files do handle multiple pages:

this.saveAs("/c/temp/test.doc","com.adobe.acrobat.doc");

Unfortunately, converting PDF files into formatted word-processing files does not always work very well because Acrobat doesn’t always know how to convert the PDF page formatting into the correct structure in the destination file, so be careful with this one.

The cleanest conversion is into PostScript:

this.saveAs("/c/temp/test.ps","com.adobe.acrobat.ps");

PostScript is a vector-based printing format closely related to PDF. I often use this conversion to completely flatten and remove all PDF features from a document. Acrobat can easily convert PostScript back into a clean PDF, so this is a perfect technique to use for converting a LiveCycle PDF form into a flat, archival PDF.
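Since the file extension and the conversion ID have to agree, one way to keep them in step is a small lookup table. The sketch below covers only the three conversions shown in this article and builds the named-parameter object accepted by doc.saveAs(). The names oConvIDs and buildSaveArgs are invented for this example, and other formats would need their IDs taken from app.fromPDFConverters.

```javascript
// Conversion IDs used in this article, keyed by file extension
var oConvIDs = {
    "jpg": "com.adobe.acrobat.jpeg",
    "doc": "com.adobe.acrobat.doc",
    "ps":  "com.adobe.acrobat.ps"
};

// Hypothetical helper: build the named-parameter object for doc.saveAs()
// from a target path, or return null if the extension is not in the table.
function buildSaveArgs(cPath) {
    var nDot = cPath.lastIndexOf(".");
    var cExt = (nDot < 0) ? "" : cPath.substring(nDot + 1).toLowerCase();
    if (!oConvIDs.hasOwnProperty(cExt)) {
        return null;
    }
    return { cPath: cPath, cConvID: oConvIDs[cExt] };
}
```

In a privileged context the result could be used as var oArgs = buildSaveArgs("/c/temp/test.ps"); followed by if (oArgs) this.saveAs(oArgs);, so an unknown extension is skipped instead of being passed to Acrobat.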

Creating a custom save function

Remember, in order to use the doc.saveAs function, it has to be run from a privileged context. In most situations, this will mean creating a folder-level trusted function. The following code defines such a function that performs only the save operation. This is a full working example, but it could also be used in a larger automation script.

var mySaveAs = app.trustedFunction(
   function(oDoc,cPath,cFlName)
   {
      app.beginPriv();
      // Ensure path has trailing "/"
      cPath = cPath.replace(/([^\/])$/, "$1/");
      try{
         oDoc.saveAs(cPath + cFlName);
      }catch(e){
         app.alert("Error During Save");
      }
      app.endPriv();
   }
);

The inputs to this function provide the document object, path and file name. All three are important for creating a generic function for saving a file. For example, inside the folder-level function the keyword "this" may or may not be the current document. The meaning of "this" depends on the calling context, which is unknown. So it is very important to include the document object, "oDoc," even if the function is meant to be used on the current PDF.

If anything goes wrong with the save, such as a bad input parameter, then the "saveAs" function will throw an exception. For this reason, it is encapsulated in a try/catch block, which helps you, the developer, debug the code. The user should never see the alert box because the code calling this function should only pass in good parameters. But if there is a problem, you’ll know about it.

Once created, this folder-level function can be called from anywhere in the Acrobat JavaScript context, including from a script inside a PDF. For example, here’s a form button script for saving the current file back to itself:

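// First make sure the function exists
if(typeof(mySaveAs) == "function"){
    // Separate the folder from the file name, since mySaveAs expects both
    var cFolder = this.path.replace(/[^\/]+$/, "");
    mySaveAs(this, cFolder, this.documentFileName);
}else{
    app.alert("Missing Save Function. Please contact forms administrator");
}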

This folder-level function will also work in Adobe Reader XI, or in older versions of Adobe Reader if the PDF is Reader Enabled with Save Rights. See the Scripting for Adobe Reader article for more information, and be sure to read the other articles cited previously. They provide important supporting information and examples, especially the Device Independent File Path Format article.