extracting data

ssuttle
Registered: Mar 29 2007
Posts: 2

Hi all,
 
I have a multi-page PDF containing invoices and want to extract a fixed field, the invoice number, to a text (CSV) file. Can anyone help?
 
Regards
 
Simon

My Product Information:
Acrobat Pro 7.0.9, Windows
thomp
Expert
Registered: Feb 15 2006
Posts: 4411
If this is a fillable PDF, then getting the data takes a single line of JavaScript. If it's a generated document, from QuickBooks for example, then this becomes much more difficult. Use the "doc.getPageNthWord()" function to search the words on each page. The tricky part is figuring out which word is the invoice number. If it's a scanned PDF, then you have to run OCR first and hope everything translates well.
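For the generated-document case, here is a rough sketch of the word search. It is untested against real invoices and rests on one assumption: the invoice number is the first word containing a digit within a few words after a label word such as "Invoice". The function name, the label, and the look-ahead distance are all made up for illustration, so adjust them to your layout. Only "numPages", "getPageNumWords()" and "getPageNthWord()" come from Acrobat itself.

```javascript
// Sketch for the Acrobat JavaScript console; pass the open document as "doc".
// ASSUMPTION: the invoice number is the first digit-bearing word that appears
// within "lookAhead" words after the label word. Adjust for your invoices.
function findInvoiceNumbers(doc, label, lookAhead) {
    var found = [];
    for (var p = 0; p < doc.numPages; p++) {
        var count = doc.getPageNumWords(p);
        var hit = null;
        for (var i = 0; i < count && hit === null; i++) {
            // Third argument true asks Acrobat to strip punctuation and whitespace.
            if (doc.getPageNthWord(p, i, true) !== label) continue;
            // Look at the next few words for something that contains a digit.
            for (var j = i + 1; j <= i + lookAhead && j < count; j++) {
                var candidate = doc.getPageNthWord(p, j, true);
                if (/\d/.test(candidate)) { hit = candidate; break; }
            }
        }
        // Pages with no match are skipped; page numbers are reported 1-based.
        if (hit !== null) found.push({ page: p + 1, invoice: hit });
    }
    return found;
}
```

In the console you would call it as findInvoiceNumbers(this, "Invoice", 3) and print each entry with console.println() to check the matches by eye. How Acrobat splits words around hyphens and other punctuation varies by document, so inspect the output before trusting it.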

Once you have the data, getting it into an external DB is not straightforward. Two things in Acrobat JS might help: the "ADBC" object and the "doc.exportAsText()" function, which exports form field data as tab-delimited text.
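Since "doc.exportAsText()" works on form fields, values pulled from page text have to be turned into CSV by hand. The sketch below uses a different route from the two mentioned above: it builds the CSV string itself, then uses "doc.createDataObject()" and "doc.exportDataObject()" to attach the text to the PDF and prompt for a save location. The function names, the row shape ({page, invoice}) and the file name "invoices.csv" are assumptions for illustration, and whether the export is allowed depends on your Acrobat security settings.

```javascript
// Build CSV text from rows shaped like { page: 1, invoice: "10023" }.
// ASSUMPTION: this row shape is hypothetical; use whatever your search returns.
function toCsv(rows) {
    function quote(value) {
        // Double any embedded quotes, then wrap the field in quotes.
        return '"' + String(value).replace(/"/g, '""') + '"';
    }
    var lines = ["page,invoice"];
    for (var k = 0; k < rows.length; k++) {
        lines.push(rows[k].page + "," + quote(rows[k].invoice));
    }
    return lines.join("\r\n");
}

// Inside Acrobat only: attach the CSV text to the PDF, then prompt to save it.
// ASSUMPTION: "invoices.csv" is a made-up attachment name.
function saveCsv(doc, rows) {
    doc.createDataObject("invoices.csv", toCsv(rows));
    doc.exportDataObject({ cName: "invoices.csv" });
}
```

Quoting the invoice field keeps values with commas or quotes from breaking the columns. Note that the attachment stays in the document unless you remove it afterwards.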

Alternatively, you can use the text select tool to manually copy and paste the data into a file.

Thom Parker
The source for PDF Scripting Info
www.pdfscripting.com
Very Important - How to Debug Your Script