Why is JavaScript in Acrobat?

This question rarely comes up, but because of some recent security issues with Acrobat, it has lately been asked on several blogs. JavaScript is, of course, the scripting language used to control interactive features in PDF files and to automate processes in Acrobat. It did not just magically appear one day and attach itself to Acrobat. It isn't a conspiracy. There is a reason.

JavaScript is a programming language developed in 1995 by Netscape as the control (i.e. scripting) language for the Netscape 2.0 browser. The good people at Netscape saw the need to give web designers greater control and flexibility over the elements on an HTML page. Specifically, some kind of scripting language was needed to support HTML form elements. The wise developers of this language foresaw that scripting could be used for more than just form field validation and calculations. They didn't know exactly how or what developers might be using it for, but that's the beauty of programming languages. Inevitably, developers want to do more with them and use them in ways the creators never considered. So they designed a very easy-to-use, self-cleaning (garbage-collected) and totally general-purpose programming language. But there's more. These wise developers also foresaw that since this scripting language was general-purpose, it could be used to script any application. So they built the language interpreter to be completely independent of the Netscape browser. Instead, Netscape (or any application) communicates with the interpreter through an interface that is customized for the specific application, making it possible to easily integrate this scripting language into any application. And that's how JavaScript was born.

Enter Acrobat

About the same time the Netscape developers were creating JavaScript, Adobe decided to add interactive form fields to PDF. Acrobat users were already creating nicely formatted, printable PDF forms for electronic distribution. The general-usage scenario was that a common form -- such as a rebate, registration, application or any kind of form already in use -- could be converted or scanned into a PDF file and then distributed electronically. In fact, many forms were already available as PDF documents, since PDF was the preferred format for printing services. This was also when the public internet was just starting to really get going, so posting forms online was popular. Of course, the idea was that the form would be downloaded, printed, filled out by hand and then conventionally mailed back.

Seems a bit primitive by today's standards, and that's exactly what people back then thought, too. Everyone was saying, "Hey, why can't I just fill this out on my computer and email it back?" This idea made sense to everyone, including Adobe, which promptly added interactive form fields to the PDF language. However, to implement form fields, there would need to be some kind of scripting language for, at a minimum, entering calculations and validating user input. You must have a scripting language for this, because there is no way to predict and handle all the different things that form designers want to calculate and validate.
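To make the calculate-and-validate requirement concrete, here is a minimal sketch in plain JavaScript. The helper names (`isValidQuantity`, `lineTotal`) and the field names in the comments are hypothetical, chosen only for illustration. In Acrobat, helpers like these would typically be called from a field's Validate or Calculate script, as the comments suggest; the helpers themselves are ordinary JavaScript and run in any interpreter.

```javascript
// Hypothetical helpers for an order form; all names are illustrative only.

// Validation: accept only whole numbers from 1 to 999.
function isValidQuantity(text) {
  return /^[0-9]{1,3}$/.test(text) && Number(text) >= 1;
}

// Calculation: quantity times unit price, rounded to whole cents.
function lineTotal(quantity, unitPrice) {
  return Math.round(quantity * unitPrice * 100) / 100;
}

// Assumed usage inside Acrobat (the field names "Qty" and "Price" are made up):
//   Validate script on the quantity field:
//     event.rc = isValidQuantity(event.value);
//   Calculate script on the total field:
//     event.value = lineTotal(this.getField("Qty").value,
//                             this.getField("Price").value);
```

The point of the sketch is that neither rule could be built into the form field itself: one designer wants whole numbers up to 999, another wants dates or part numbers, so the rule has to be expressed as code.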

At this point, the Adobe developers could have decided to implement their own scripting language. It probably would have been called AcroScript. It's an interesting idea and a cool name, but even simple programming languages require an enormous effort to design and maintain. Netscape was offering a full-featured programming language ready to be integrated into any application. And not only does Netscape update and maintain the JavaScript interpreter, they also give it away for free. That's a deal that's hard to beat; and that's how JavaScript got into Acrobat.

Once you start using a scripting language, you can't stop. Scripting languages are typically simple, easy to use, and able to perform powerful operations with a single line of code. Once Adobe started getting feedback from customers, they realized that the form designers wanted to script anything and everything. As with web pages, interactive and automated features in PDF documents are very popular. So Adobe expanded the JavaScript Object Model like crazy and added scripting to any feature where a little programming might be handy. Scripting is no longer just about forms. For example, some of the more popular uses are for controlling comment and review sessions, 3D models, multimedia, and automating document processes in Acrobat.

Security concerns

In HTML files, JavaScript is used to control dynamic and interactive browser (i.e. web page) features. Because the JavaScript code is in the HTML page itself, it is run in the browser on the user's system. There is a very valid concern that the script not be able to affect the user's system without their knowledge. This has been a persistent problem with browsers (and it still is), and one constantly being addressed by the developers of the popular operating systems and browsers.

On the other hand, in file formats used for design purposes, such as Photoshop or InDesign files, security is not a concern. The files themselves do not contain any script code and are not widely or arbitrarily distributed. For design applications, the scripts live on the local system and scripting is used purely to create macros and automate repetitive tasks.

PDF files are a finished, distributable document format, and Acrobat is both a design tool and a document viewer. This puts Acrobat/PDF scripting in a particularly awkward position. In Acrobat, scripting is used in both ways -- to control dynamic and interactive document features, and to create macros and automate repetitive tasks. These uses are slightly at odds with one another. To provide for user security, the dynamic/interactive document scripts have to be isolated so they do not interact with the user's system. But for the automation scripts to be effective, they must be able to access the user's system and manipulate PDF documents.

This dual-usage security issue was known from the very start, so Acrobat JavaScript operated from a two-level, sandboxed environment. Sandboxed means that all scripts have restricted access to the user's system. Two-level means that scripts on the user's system were allowed to perform major document manipulations, and scripts in a PDF file were not. This covers both usage models, but in the early days, the security model was a bit blurry and weak. Over the years, security concerns have become much more severe and Adobe has enhanced its security model (often breaking earlier scripts and enraging developers). In Acrobat 7, the concept of privilege was introduced. Privilege formalized the separation between the two uses for scripting. Automation scripts are trusted, and therefore operate in a privileged context. PDF scripts are untrusted (except under very special circumstances), so they cannot access privileged functionality. This separation is even more severe in Adobe Reader, which does not implement many of the operations Adobe considers a security risk.
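The two-level idea can be illustrated with a toy model in plain JavaScript. This is not Acrobat's implementation; `makeHost`, `runTrusted`, `runDocumentScript`, and `saveAs` are hypothetical names invented for the sketch. It only shows the shape of the rule: the same operation succeeds when called from a trusted script and is refused when called from a script embedded in a document. In Acrobat itself, folder-level scripts raise privilege through `app.trustedFunction` together with `app.beginPriv` and `app.endPriv`; the sketch does not call those.

```javascript
// Toy model of a two-level privilege check. All names are hypothetical.
function makeHost() {
  var privileged = false; // privilege level of the script currently running
  var saved = [];         // record of "files" written, for inspection

  // Run fn at the given privilege level, restoring the old level afterwards.
  function runAs(level, fn) {
    var previous = privileged;
    privileged = level;
    try {
      return fn();
    } finally {
      privileged = previous;
    }
  }

  return {
    saved: saved,
    // A sensitive operation: allowed only in a privileged context.
    saveAs: function (path) {
      if (!privileged) {
        throw new Error("saveAs requires a privileged context");
      }
      saved.push(path);
    },
    // Scripts installed on the user's system (automation) are trusted.
    runTrusted: function (fn) { return runAs(true, fn); },
    // Scripts embedded in a PDF file are not.
    runDocumentScript: function (fn) { return runAs(false, fn); }
  };
}

var host = makeHost();
host.runTrusted(function () { host.saveAs("report.pdf"); }); // allowed

var refused = false;
try {
  host.runDocumentScript(function () { host.saveAs("stolen.txt"); });
} catch (e) {
  refused = true; // the document script is blocked
}
```

The check lives inside the sensitive operation rather than in the calling script, so a document script cannot opt out of it.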

It's interesting to note that during its entire history in Acrobat, the scripting model itself has never posed a serious security threat. There were some small holes here and there that allowed a PDF script to write a file to the user's system, and maybe even for some user data to be collected without warning the user first, but nothing more direct than this. The most any malicious script could ever do is spoof the user. Spoofing means fooling the user into doing something they shouldn't, such as sending a password or credit card number to someone who shouldn't have it. With the latest version of Acrobat, all of these holes, perceived and real, have been pretty well closed. As a developer, I find the security a serious pain. It means greater difficulty in creating truly powerful automation scripts and easy-to-use interfaces. But as a document user, I seriously appreciate the PDF document security, which means a hacker can never create a malicious PDF document using the Acrobat JavaScript model.

And as for the JavaScript model itself, Adobe would be courting madness to ever think of removing it. Countless customers rely on the scripting features, particularly interactive forms, but also automation, multimedia and so on. These customers range from the smallest hobby clubs to the largest corporations in the world. PDF documents are used across the board for an endless variety of electronic document applications.

So what's the real security issue?

If Acrobat JavaScript is so tightly wrapped up, then why am I reading it's a security problem? Even Adobe recently recommended that users turn off JavaScript until a patch was released for a problem discovered/reported early this year. Well, at one time HTML JavaScript had some pretty scary holes in it as well (many rich internet features still do). This unfairly makes JavaScript an easy target, and people who don't understand what's going on tend to go a bit "Chicken Little" on their blogs.

In fact, the recently reported security problems (and nearly every other bug and security hole in Acrobat) have nothing to do with the JavaScript model. The underlying culprit is what's called a memory leak. The term "memory leak" had a specific meaning at one time, but it's come to be used loosely for any action by the software application that improperly handles memory resources (the more precise umbrella term is a memory-safety bug). Leaks in this loose sense include things like writing past the end of allocated memory, using invalid pointers to memory, and not freeing allocated memory when it is no longer needed (the original meaning). Any of these issues may or may not cause unexpected behavior. Application bugs caused by a memory leak are very indirect and notoriously difficult to track down. But for the most part, memory leaks don't cause consistent bugs, only an errant crash or lockup every once in a great while. For this reason, they are very easy to ignore.

Acrobat is full of unhandled memory leaks from way back to the early versions; some memory leaks have been carried over in the code base for years. Now the real security issue with Acrobat is that some of these memory leaks overwrite what's called the call stack. Among other things, this is where an application stores return addresses: pointers to code that it's going to run at some later time. They are not meant to be changed. So if one of these pointers can be overwritten with a different pointer, then an attacker can force Acrobat to run something it was never meant to run. And that's exactly what's happened. There are several different ways to create this situation, and certain JavaScript actions happen to be one of them.
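The overwrite described above can be imitated in plain JavaScript. JavaScript arrays are bounds-safe, so this is only a simulation: the "frame" is an ordinary array whose layout (four buffer slots followed by one return-address slot) is an assumption made for the sketch, and every name in it is hypothetical. It shows how a copy that skips the length check lets input data land in the slot that decides what code runs next.

```javascript
// Simulated stack frame. Assumed layout: slots 0..3 are a local buffer,
// slot 4 holds the "return address" (here, the name of a handler to run next).
var BUFFER_SIZE = 4;
var RETURN_SLOT = 4;

var handlers = {
  resume: function () { return "normal execution"; },
  payload: function () { return "attacker-chosen code"; }
};

function makeFrame() {
  return [0, 0, 0, 0, "resume"];
}

// Buggy copy: trusts the input length and never checks the buffer size.
function uncheckedCopy(frame, input) {
  for (var i = 0; i < input.length; i++) {
    frame[i] = input[i];
  }
}

// Fixed copy: rejects input that does not fit in the buffer.
function checkedCopy(frame, input) {
  if (input.length > BUFFER_SIZE) {
    throw new RangeError("input longer than buffer");
  }
  for (var i = 0; i < input.length; i++) {
    frame[i] = input[i];
  }
}

// "Returning" from the function runs whatever the return slot names.
function returnFrom(frame) {
  return handlers[frame[RETURN_SLOT]]();
}

var oversized = [1, 2, 3, 4, "payload"]; // fifth element spills past the buffer

var victim = makeFrame();
uncheckedCopy(victim, oversized);
var hijacked = returnFrom(victim); // runs the handler the input selected

var guarded = makeFrame();
var rejected = false;
try {
  checkedCopy(guarded, oversized);
} catch (e) {
  rejected = e instanceof RangeError;
}
var intact = returnFrom(guarded); // return slot was never touched
```

In the simulation, the script that supplies the oversized input is not the bug; the missing bounds check in the copy routine is.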

The JavaScript model in Acrobat is about as secure as a scripting model can get, and it has never had a serious issue. The only reason Acrobat has security issues is that Acrobat has memory leaks, leaks that should have been fixed years ago. The Acrobat developers are now in the process of fixing these leaks, and when they do, there will be no more security holes in Acrobat. And most especially not in Acrobat JavaScript.
