I am attempting to write a batch file. I want the batch to look at a 5000 page document and specifically look at lines 4 and 5 of each individual page. I then want it to automatically insert bookmarks based on the information read. Also, it needs to omit any and all duplicates. Can this be done?
First, text in a PDF is not organized in lines. Each piece of text is placed with X and Y coordinates. But you can estimate the vertical position of the lines you want to examine and then pick out the words on those lines based on their location. You'll need these two Acrobat JavaScript functions:
doc.getPageNthWord() and doc.getPageNthWordQuads()
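Here is a minimal sketch of how those two functions could be combined. It is an illustration under assumptions, not a tested production script: the helper names wordsInBand and addPageBookmarks are made up for this example, and the band limits yMin and yMax are values you would have to measure on your own pages (quad coordinates are normally in points, measured up from the bottom of the page). It collects the words whose quads fall inside the band on each page, skips any label that has already been used, and adds one bookmark per new label.

```javascript
// Hypothetical helper: join the words on one page whose vertical midpoint
// lies between yMin and yMax (the estimated position of "lines 4 and 5").
function wordsInBand(doc, pageNum, yMin, yMax) {
  var words = [];
  var count = doc.getPageNumWords(pageNum);
  for (var i = 0; i < count; i++) {
    var quads = doc.getPageNthWordQuads(pageNum, i);
    if (!quads || quads.length === 0) continue;
    var q = quads[0];
    // A quad is 8 numbers (4 x,y pairs); the y values sit at the odd indices.
    // Using min/max avoids depending on the order of the corners.
    var ys = [q[1], q[3], q[5], q[7]];
    var mid = (Math.min.apply(null, ys) + Math.max.apply(null, ys)) / 2;
    if (mid >= yMin && mid <= yMax) {
      // Third argument true strips punctuation and whitespace from the word.
      words.push(doc.getPageNthWord(pageNum, i, true));
    }
  }
  return words.join(" ");
}

// Hypothetical helper: one bookmark per distinct label, in page order.
function addPageBookmarks(doc, yMin, yMax) {
  var seen = {};   // labels already bookmarked, used to omit duplicates
  var added = [];
  for (var p = 0; p < doc.numPages; p++) {
    var label = wordsInBand(doc, p, yMin, yMax);
    var key = "k:" + label;   // prefix avoids clashes with built-in property names
    if (label === "" || Object.prototype.hasOwnProperty.call(seen, key)) continue;
    seen[key] = true;
    // createChild(cName, cExpr, nIndex): the index appends after earlier bookmarks.
    doc.bookmarkRoot.createChild(label, "this.pageNum = " + p + ";", added.length);
    added.push(label);
  }
  return added;
}
```

From the JavaScript console or a batch sequence you would call something like addPageBookmarks(this, 650, 700), where 650 and 700 are placeholder numbers only. On a 5000 page document this will be slow, since every word on every page is examined, so try it on a short extract first.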
Thom Parker
The source for PDF Scripting Info
www.pdfscripting.com
Very Important - How to Debug Your Script