Removing Office Files from an Acrobat Portfolio

Suzanne Engel
Registered: Jan 24 2011
Posts: 6
Answered

Hi,
I've created an Adobe Acrobat Portfolio that includes both Word documents and PDF files. I would like to publish the portfolio so that only the PDF files are available. The Word documents are not protected, and I don't want to give people access to them.

I have over 600 files in this portfolio that would need to be either deleted or protected in Word. The only way I have come up with so far to remove the Word documents but keep the PDF files is to search for .doc and delete each file individually. There has to be an easier way to do this. Any suggestions would be greatly appreciated!

My Product Information:
Acrobat Pro 9.0, Windows
Merlin
Acrobat 9ExpertTeam
Registered: Mar 1 2006
Posts: 766
Hi,

the "Export Portfolio Metadata to Console" feature of Joel's utilities may help you: http://blogs.adobe.com/pdfdevjunkie/2008/10/joels_pdf_portfolio_utilities.php

;-)
thomp
Expert
Registered: Feb 15 2006
Posts: 4411
An automation script could easily walk through the attachments and delete the doc file.

foreach(var oAtt in this.dataObjects)
{
if(/\.doc$/.test(oAtt.name))
this.removeDataObject(oAtt.name)
}

Run this from the console window

Thom Parker
The source for PDF Scripting Info
www.pdfscripting.com
Very Important - How to Debug Your Script

Merlin
Acrobat 9ExpertTeam
Registered: Mar 1 2006
Posts: 766
thomp wrote:
An automation script could easily walk through the attachments and delete the doc file...
Wow, more great info!
Thomp, you're a JavaScript gold mine.
;-))
thomp
Expert
Registered: Feb 15 2006
Posts: 4411
Accepted Answer
Thanks, this is pretty much what I do all day, every day.

But the script was just a quick one-off and has some big errors that I forgot to fix before posting, like the syntax error in the "for each".

1) Because elements are being deleted from the list, the loop needs to run backwards.

2) It should really be testing the MIMEType, which is a more reliable indicator of the file type.

for(var i=this.dataObjects.length-1;i>=0;i--)
{
    // Remove any attachment whose MIME type marks it as a Word document
    if(this.dataObjects[i].MIMEType == "application/msword")
        this.removeDataObject(this.dataObjects[i].name);
}
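
To illustrate point 1, here is a minimal sketch that uses a plain array as a stand-in for the attachment list. The file names and both function names are made up for the example, and it assumes the list shrinks as items are removed. It shows how a forward loop can skip an item right after a deletion, while a backward loop does not.

```javascript
// Plain-array stand-in for the attachment list (hypothetical file names).
var sample = ["a.doc", "b.doc", "c.pdf"];

// Forward loop: after a removal the later items shift down by one,
// so the item that moves into slot i is never tested.
function removeDocsForward(names) {
    var list = names.slice(); // work on a copy
    for (var i = 0; i < list.length; i++) {
        if (/\.doc$/i.test(list[i])) {
            list.splice(i, 1);
        }
    }
    return list;
}

// Backward loop: a removal only shifts items that were already visited.
function removeDocsBackward(names) {
    var list = names.slice(); // work on a copy
    for (var i = list.length - 1; i >= 0; i--) {
        if (/\.doc$/i.test(list[i])) {
            list.splice(i, 1);
        }
    }
    return list;
}

var forwardResult = removeDocsForward(sample);   // "b.doc" is left behind
var backwardResult = removeDocsBackward(sample); // only "c.pdf" remains
```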



Thom Parker

Suzanne Engel
Registered: Jan 24 2011
Posts: 6
Thank you both so much for your advice. I apologize that I am not more computer literate and I am struggling with implementation of this script.

Here is what I did:
1. Went to the menu bar, clicked on “Advanced,” then clicked on “Document Processing,” and then clicked on “JavaScript Debugger”
2. Copied and pasted the following text in the box under “Console”
for(var i=this.dataObjects.length-1;i>=0;i--)
{
if(/this.dataObjects[i].MIMEType == "application/msword")
this.removeDataObject(this.dataObjects[i].name)
}
3. Highlighted the above text within the box and hit the Ctrl + Enter keys
4. It returned the following:
SyntaxError: invalid flag after regular expression
2:Console:Exec
undefined

Any suggestions? I really appreciate your patience with me!
gkaiseril
Online
Expert
Registered: Feb 23 2006
Posts: 4307
The "if" statement should read:

if(this.dataObjects[i].MIMEType == "application/msword")


Somehow the "/" character slipped into the text during cutting and pasting. Not a rare occurrence.

So the entire script should be:

for(var i=this.dataObjects.length-1;i>=0;i--)
{
if(this.dataObjects[i].MIMEType == "application/msword")
this.removeDataObject(this.dataObjects[i].name)
}
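
As a hedged aside on why that one character matters: a "/" in that position is read as the start of a regular expression literal, which then runs up to the "/" in "application/msword", so the trailing "msword" is taken as regular expression flags. The sketch below uses a hypothetical helper, compileError, to check both versions of the line without running them. It assumes an engine where the Function constructor is available, and the exact error text varies by engine.

```javascript
// Hypothetical helper: try to compile (not run) a snippet and report the outcome.
function compileError(source) {
    try {
        new Function(source); // parses the source without executing it
        return null;          // no syntax error found
    } catch (e) {
        return e.name;        // name of the error type that was thrown
    }
}

// The line as it was pasted, with the stray "/" after "if(".
var badLine = 'if(/this.dataObjects[i].MIMEType == "application/msword") {}';

// The corrected line.
var goodLine = 'if(this.dataObjects[i].MIMEType == "application/msword") {}';

var badResult = compileError(badLine);   // a syntax error is reported
var goodResult = compileError(goodLine); // compiles cleanly
```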


George Kaiser

thomp
Expert
Registered: Feb 15 2006
Posts: 4411
Thanks for catching that, George. I fixed the post to show the correct code. Don't know how that extra slash got in there.

Thom Parker

Merlin
Acrobat 9ExpertTeam
Registered: Mar 1 2006
Posts: 766
Thanks again.
Suzanne Engel
Registered: Jan 24 2011
Posts: 6
I really appreciate your help and PATIENCE with me.
Here is what I did:
1. Went to the menu bar, clicked on “Advanced,” then clicked on “Document Processing,” and then clicked on “JavaScript Debugger”
2. Copied and pasted the following text in the box under “Console”
for(var i=this.dataObjects.length-1;i>=0;i--)
{
if(this.dataObjects[i].MIMEType == "application/msword")
this.removeDataObject(this.dataObjects[i].name)
}
3. Highlighted the above text within the box and hit the Ctrl + Enter keys
4. It returned the following:
"undefined"

Any suggestions would be greatly appreciated. Once again, THANK YOU!
thomp
Expert
Registered: Feb 15 2006
Posts: 4411
It seems to have operated correctly. Were the attachments removed from the PDF?

Thom Parker

Suzanne Engel
Registered: Jan 24 2011
Posts: 6
It removes the .doc files but not my .docX files.
thomp
Expert
Registered: Feb 15 2006
Posts: 4411
The "docx" must have a different MIMEType. Run this code in the console window to see what the mime types are.

for(var i=this.dataObjects.length-1;i>=0;i--)
{
console.println(this.dataObjects[i].name + " = " + this.dataObjects[i].MIMEType);
}
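
With several hundred attachments, the listing above can run to many pages. As a rough sketch, the types could be tallied instead. countMimeTypes is a hypothetical helper, not part of the Acrobat API, and it assumes that a missing MIMEType shows up as null, undefined, or an empty string.

```javascript
// Hypothetical helper: count how many attachments report each MIME type.
// Takes the attachment list (for example this.dataObjects) and returns
// an object that maps each type to its count.
function countMimeTypes(dataObjects) {
    var counts = {};
    var list = dataObjects || []; // guard in case there are no attachments
    for (var i = 0; i < list.length; i++) {
        var type = list[i].MIMEType;
        var key = type ? String(type) : "(none)"; // label for a missing type
        if (Object.prototype.hasOwnProperty.call(counts, key)) {
            counts[key] = counts[key] + 1;
        } else {
            counts[key] = 1;
        }
    }
    return counts;
}

// Possible use in the Acrobat console (left as comments so that the
// helper can also be tried outside Acrobat):
// var counts = countMimeTypes(this.dataObjects);
// for (var k in counts) console.println(k + ": " + counts[k]);
```

A short summary like this makes it easier to spot a group of files that have no MIME type at all.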

Thom Parker

Suzanne Engel
Registered: Jan 24 2011
Posts: 6
It comes back with 24 pages of feedback. Here is an example of where it says docx

<846>List of Programs.docx = null

Does that make sense?
thomp
Expert
Registered: Feb 15 2006
Posts: 4411
Yes it does. It means that the docx files do not have a MIMEType, so to remove these files you'll need to modify the delete script to also check the file extension, like this. How the attachment properties are set up depends on how the attachment was added, though. In this code the "path" property is used to test for the extension. To cover all the bases, the code should also include a test of the "name" property.


var rgDoc = /\.docx?$/i; // matches names ending in .doc or .docx, in any letter case
for(var i=this.dataObjects.length-1;i>=0;i--)
{
    if((this.dataObjects[i].MIMEType == "application/msword") || rgDoc.test(this.dataObjects[i].path))
        this.removeDataObject(this.dataObjects[i].name);
}
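
A sketch of the "cover all the bases" version, which also tests the "name" property, might look like the following. The helper names isWordAttachment and removeWordAttachments are made up for this example. The code assumes it is handed the document object (this in the console), and it collects the matching names first and removes them afterwards, which is an alternative to looping backwards.

```javascript
var rgDoc = /\.docx?$/i; // .doc or .docx at the end, in any letter case

// Hypothetical helper: true if the MIME type, the path, or the name
// suggests a Word file.
function isWordAttachment(att) {
    return att.MIMEType == "application/msword" ||
           rgDoc.test(att.path || "") ||
           rgDoc.test(att.name || "");
}

// Hypothetical helper: remove every Word attachment from doc and
// return how many were removed.
function removeWordAttachments(doc) {
    var list = doc.dataObjects || []; // guard in case there are no attachments
    var names = [];
    for (var i = 0; i < list.length; i++) {
        if (isWordAttachment(list[i])) {
            names.push(list[i].name);
        }
    }
    // Removal happens only after the scan, so no item is skipped.
    for (var j = 0; j < names.length; j++) {
        doc.removeDataObject(names[j]);
    }
    return names.length;
}

// Possible use in the Acrobat console:
// removeWordAttachments(this);
```

Because the last line returns a count, the console would show a number rather than "undefined", which gives a quick check that something was actually removed.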

Thom Parker

Suzanne Engel
Registered: Jan 24 2011
Posts: 6
That worked beautifully. Thank you so much for problem-solving that with me. You have saved me numerous hours of work. Thanks again!