
Extraction of Highlighted comments from PDF

eswer_engg
Registered: Sep 26 2007
Posts: 8

Hello, I am new to JavaScript. I want to extract the highlighted (text) comments from a PDF file, along with the corresponding page numbers.
Can you please assist me?
Thanks,
Eswer.

My Product Information:
Acrobat Standard 7.0.0, Windows
thomp
Expert
Registered: Feb 15 2006
Posts: 4411
There are different ways to extract comments (called annotations) from a PDF. For transferring and saving comments, Acrobat can extract detailed comment information into an FDF file. You can do this from the Acrobat menus (Export Data), or with the "doc.exportAsFDF()" JavaScript function. Unfortunately, you can't control which comments are extracted.

There is also the "Summarize" feature, which collects and displays information on each comment in human-readable form. There is some control over which comments are summarized, but maybe not exactly what you want. This is also an undocumented JavaScript function that you can modify for your own purposes, but it's pretty advanced stuff and I wouldn't suggest it.

You can also write your own function for summarizing comment data by looping through the list of comments on the PDF. Acquire the list of comments with the "doc.getAnnots()" function.
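As a rough illustration of that loop, here is a minimal sketch. It assumes the script runs inside Acrobat, where "doc" is the open document (usually "this"), and that the "Highlight" type of comment is what is wanted. The function name collectHighlights is made up for this example. Note that "contents" is the note text attached to the comment, which may be empty, and is not necessarily the highlighted words themselves.

```javascript
// Sketch only: collect Highlight comments together with their page numbers.
// collectHighlights is a hypothetical helper name, not part of the Acrobat API.
// `doc` is assumed to be an Acrobat Doc object (inside Acrobat, usually `this`).
function collectHighlights(doc) {
  var annots = doc.getAnnots(); // null when the PDF has no comments
  var results = [];
  if (!annots) {
    return results;
  }
  for (var i = 0; i < annots.length; i++) {
    var a = annots[i];
    if (a.type === "Highlight") {
      results.push({
        page: a.page + 1, // annotation page numbers are 0-based
        author: a.author,
        contents: a.contents
      });
    }
  }
  return results;
}
```

From the Acrobat JavaScript console, something like collectHighlights(this) should return one entry per Highlight comment. Dropping the type test would collect every kind of comment instead.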

In order to give you further advice, we'll need some more information. What do you mean by "highlighted" comments? Do you mean the "Highlight" type of comment, or selected comments? What do you mean by extract? Do you want the comment info in a particular file format, like XML, or do you just want a summary of the comments? What information about the comments do you want to collect?

Thom Parker
The source for PDF Scripting Info
www.pdfscripting.com
Very Important - How to Debug Your Script

ki77ku
Registered: Dec 7 2010
Posts: 1
I am looking for ways to automate the extraction of comments/annotations from a PDF file into a Notepad text file, an XML file, or an Excel file. The information to be extracted is the annotation and the number of the page it comes from.

e.g. annotation1 page15|annotation2 page20|.......

I want to automate the process, so that if I give the PDF file location and name, I get the extracted information in an external text file. Could this be done in Java? If so, please let me know how. Any sample code would be of great help.
try67
Expert
Registered: Oct 30 2008
Posts: 2398
This can be done in either Java or JavaScript. If you use JavaScript, Acrobat needs to be open.
With Java, it's possible to do it on a computer that doesn't even have Acrobat or Reader installed.
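For the JavaScript route, a rough sketch of building the "text pageN|text pageN" line could look like the following. It assumes the code runs inside Acrobat with the PDF already open, and the function name annotsToLine is made up for this example. Getting the resulting string into an external text file is a separate step that is not shown here.

```javascript
// Sketch only: build one "contents pageN|contents pageN" line from all comments.
// annotsToLine is a hypothetical helper name, not part of the Acrobat API.
// `doc` is assumed to be an Acrobat Doc object (inside Acrobat, usually `this`).
function annotsToLine(doc) {
  var annots = doc.getAnnots() || []; // getAnnots() gives null when there are no comments
  var parts = [];
  for (var i = 0; i < annots.length; i++) {
    // Replace pipes and line breaks so they cannot be mistaken for the delimiter.
    var text = String(annots[i].contents || "").replace(/[|\r\n]+/g, " ");
    // Annotation page numbers are 0-based, so add 1 for the printed page number.
    parts.push(text + " page" + (annots[i].page + 1));
  }
  return parts.join("|");
}
```

In the Acrobat console, the result could be inspected with console.println(annotsToLine(this)).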

- AcrobatUsers Community Expert - Contact me personally at try6767 [at] gmail [dot] com
Check out my custom-made scripts website: http://try67.blogspot.com