
Extraction of Highlighted Annotations in PDF

eswer_engg
Registered: Sep 26 2007
Posts: 8

Thanks for the reply.
I want to clarify what you asked about.
I have a PDF file in which some of the text is highlighted in different colours. I want to extract that text from the PDF file, together with the corresponding page number, and store it in a plain text file on the local system.
 
//My Code given below//
 
this.syncAnnotScan();
// getAnnots() takes nPage, nSortBy, bReverse and nFilterBy; it has no type or
// colour filter, so the Highlight check is done in the loop below.
// It returns null when nothing is found, hence the "|| []".
var myAnnotList = this.getAnnots({nPage: 0, nSortBy: ANSB_Page}) || [];
 
// Build the report
var myReport = new Report();
myReport.size = 1.5;
myReport.color = color.blue;
myReport.writeText("Summary of Highlighted Comments with Page Number");
myReport.color = color.black;
myReport.writeText(" ");
myReport.writeText("Number of comments: " + myAnnotList.length);
myReport.writeText(" ");
 
for (var i = 0; i < myAnnotList.length; i++)
{
    if (myAnnotList[i].type == "Highlight")
    {
        // contents is the note attached to the annotation, not the text under it
        myReport.writeText("Type: " + myAnnotList[i].type + "; Content: " + myAnnotList[i].contents + "; Page: " + myAnnotList[i].page);
    }
}
 
// Report.open() takes the title of the report document as a string
var Docrep = myReport.open("FinalReport.pdf");
 
With this code I get the annotation type and the page number of each highlighted word or passage, but I do not get the highlighted text itself.
 
I want to get the annotation type, the page number and the highlighted text.
Please help me with this.
 
Thanks and regards,
ESWER

My Product Information:
Acrobat Standard 7.0.0, Windows
thomp
Expert
Registered: Feb 15 2006
Posts: 4411
The highlight annotation doesn't really have anything to do with the text. It's just a graphic that sits on top of the page. If you want to find what it's sitting on you'll have to do a search of all the text on the page to find the text that matches the coordinates of the Highlight Annot.

Use the "doc.getPageNthWord()" and "doc.getPageNthWordQuads()" functions. It's actually not as complex as it sounds. Given the code you've already written, this should only be a few more lines. If you search the Acrobat JS Reference you'll find some examples that use these functions.
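
As a rough illustration of the coordinate matching described above, here is a minimal sketch rather than a tested Acrobat script. The helper names (quadBounds, centreInside, getHighlightedText) are made up for this example. It assumes that annot.quads and the result of getPageNthWordQuads() are both arrays of quads, each quad being 8 numbers (four x,y corner pairs) in the same coordinate space, and it counts a word as highlighted when the centre of its box lies inside a highlight box.

```javascript
// Bounding box [xMin, yMin, xMax, yMax] of one quad (8 numbers: four x,y pairs).
function quadBounds(quad) {
    var xs = [quad[0], quad[2], quad[4], quad[6]];
    var ys = [quad[1], quad[3], quad[5], quad[7]];
    return [Math.min.apply(null, xs), Math.min.apply(null, ys),
            Math.max.apply(null, xs), Math.max.apply(null, ys)];
}

// True when the centre of the word box lies inside the highlight box.
function centreInside(wordBox, markBox) {
    var cx = (wordBox[0] + wordBox[2]) / 2;
    var cy = (wordBox[1] + wordBox[3]) / 2;
    return cx >= markBox[0] && cx <= markBox[2] &&
           cy >= markBox[1] && cy <= markBox[3];
}

// Collect the words on annot.page that sit under the annotation's quads.
// doc is the Doc object (pass "this" from a document-level script).
function getHighlightedText(doc, annot) {
    var words = [];
    var markQuads = annot.quads || [];
    var numWords = doc.getPageNumWords(annot.page);
    for (var w = 0; w < numWords; w++) {
        var wordQuads = doc.getPageNthWordQuads(annot.page, w) || [];
        var hit = false;
        for (var i = 0; i < wordQuads.length && !hit; i++) {
            for (var j = 0; j < markQuads.length && !hit; j++) {
                hit = centreInside(quadBounds(wordQuads[i]), quadBounds(markQuads[j]));
            }
        }
        if (hit) {
            // third argument true strips punctuation and whitespace from the word
            words.push(doc.getPageNthWord(annot.page, w, true));
        }
    }
    return words.join(" ");
}
```

Inside the existing loop, the highlighted text could then be written with something like getHighlightedText(this, myAnnotList[i]). The centre-point test is a design choice that tolerates highlights drawn slightly larger or smaller than the words; a stricter overlap test may be needed for tightly spaced lines.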

Thom Parker
The source for PDF Scripting Info
www.pdfscripting.com
Very Important - How to Debug Your Script

eswer_engg
Registered: Sep 26 2007
Posts: 8
Thanks for your valuable idea, but I am not able to use doc.getPageNthWord() and doc.getPageNthWordQuads() properly. Can you please give me some code for this? (ASAP)
catfiche
Registered: Nov 3 2006
Posts: 1
I use a third party program AbridgePDF for just that and it works like a charm. I think it runs about $100 for a single-user license, worth the price if you're regularly highlighting research documents like I am.