
Get the text of a link

Danious
Registered: Apr 14 2008
Posts: 10

Hello,

I need to get the text of a link on a page of my PDF, from a script that runs when the page opens. I retrieve the links with this.getPageBox("Crop", p) and this.getLinks(p, b), and setting their actions with the setAction method works well, but I don't know how to retrieve the text of the link.

Thanks

My Product Information:
Acrobat Pro 8.1.2, Windows
Danious
Registered: Apr 14 2008
Posts: 10
If anyone is interested, here is a (very ugly) solution:

    for (var p = 0; p < this.numPages; p++) {
        var b = this.getPageBox("Crop", p);
        var l = this.getLinks(p, b);
        var numWords = this.getPageNumWords(p);
        var linkStrings = new Array();
        if (l.length > 0) {
            for (var init = 0; init < l.length; init++)
                linkStrings[init] = "";
            for (var j = 0; j < numWords; j++) {
                // Flatten the word's quads into a list of x,y coordinates
                var quads = this.getPageNthWordQuads(p, j).toString().split(',');
                for (var i = 0; i < l.length; i++) {
                    // All four corners of the word must lie inside the link rect (1 pt tolerance)
                    if (l[i].rect[0]-1 <= quads[0] && quads[0] <= l[i].rect[2]+1 &&
                        l[i].rect[0]-1 <= quads[2] && quads[2] <= l[i].rect[2]+1 &&
                        l[i].rect[0]-1 <= quads[4] && quads[4] <= l[i].rect[2]+1 &&
                        l[i].rect[0]-1 <= quads[6] && quads[6] <= l[i].rect[2]+1 &&
                        l[i].rect[3]-1 <= quads[1] && quads[1] <= l[i].rect[1]+1 &&
                        l[i].rect[3]-1 <= quads[3] && quads[3] <= l[i].rect[1]+1 &&
                        l[i].rect[3]-1 <= quads[5] && quads[5] <= l[i].rect[1]+1 &&
                        l[i].rect[3]-1 <= quads[7] && quads[7] <= l[i].rect[1]+1) {
                        linkStrings[i] += this.getPageNthWord(p, j, false).replace(" ","").replace("\t","").replace("\n","");
                    }
                }
            }
        }
    }

A better solution certainly exists...
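
For readability, the long containment test in the script above could be factored into a small helper. This is only a sketch: quadInsideRect and its tol parameter are hypothetical names, not part of the Acrobat API, and it assumes the link rect is ordered [left, top, right, bottom] and that a word quad is eight values alternating x and y, which is what the comparisons above imply.

```javascript
// Hypothetical helper: true when every corner of a word quad lies inside
// the link rectangle, widened by `tol` points on each side.
// Assumes rect = [left, top, right, bottom] and quad = [x1, y1, ..., x4, y4].
function quadInsideRect(rect, quad, tol) {
    for (var k = 0; k < 8; k += 2) {
        // Number() lets the helper accept the strings produced by split(',')
        var x = Number(quad[k]);
        var y = Number(quad[k + 1]);
        if (x < rect[0] - tol || x > rect[2] + tol) return false;
        if (y < rect[3] - tol || y > rect[1] + tol) return false;
    }
    return true;
}
```

Inside the word loop, the condition would then read if (quadInsideRect(l[i].rect, quads, 1)).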
ITFLA
Registered: Jan 28 2011
Posts: 3
This works pretty well. Punctuation symbols (.-,) do not appear but the "words" are all there and the speed is good.



    for (var page = 0; page < this.numPages; page++) {
        var b = this.getPageBox("Crop", page);
        var l = this.getLinks(page, b);
        console.println("Page " + page + " has " + l.length + " links");
        var numWords = this.getPageNumWords(page);
        if (l.length > 0) {
            for (var iLink = 0; iLink < l.length; iLink++) {
                var target = l[iLink].rect;
                var result = "";
                var selection;
                for (var i = 0; i < numWords; i++) {
                    selection = this.getPageNthWordQuads(page, i);
                    // Keep the word if its first quad overlaps the link rectangle
                    if (target[0] <= selection[0][2] && target[1] >= selection[0][5] &&
                        target[2] >= selection[0][0] && target[3] <= selection[0][3]) {
                        result += this.getPageNthWord(page, i) + ' ';
                    }
                }
                console.println("Page " + page + " Link " + iLink + ": (" + result + ")");
            }
        }
    }
try67
Expert
Registered: Oct 30 2008
Posts: 2398
If you set the bStrip parameter (the third argument) of getPageNthWord to false, it will return the punctuation marks as well.
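
A rough sketch of that change, for illustration only: linkText is a hypothetical helper, not an Acrobat method, and it takes the Doc object as a parameter so it can be tried outside Acrobat. It reuses the overlap test from the script above and assumes that, with bStrip set to false, each word comes back with its own punctuation and trailing whitespace, so no extra space is appended.

```javascript
// Sketch only: `linkText` is a hypothetical helper, not part of the Acrobat API.
// `doc` is the Doc object (pass `this` from a document or page-open script);
// `rect` is a link's rect, assumed to be [left, top, right, bottom].
function linkText(doc, page, rect) {
    var text = "";
    var numWords = doc.getPageNumWords(page);
    for (var i = 0; i < numWords; i++) {
        // Only the word's first quad is checked, as in the script above.
        var q = doc.getPageNthWordQuads(page, i)[0];
        if (rect[0] <= q[2] && rect[1] >= q[5] && rect[2] >= q[0] && rect[3] <= q[3]) {
            // bStrip = false keeps punctuation and whitespace in the returned word.
            text += doc.getPageNthWord(page, i, false);
        }
    }
    return text;
}
```

In the loop above it could be called as linkText(this, page, l[iLink].rect).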

- AcrobatUsers Community Expert - Contact me personally at try6767 [at] gmail [dot] com
Check out my custom-made scripts website: http://try67.blogspot.com