Hello, this issue has us contemplating leaving the IT environment to open a food joint. Hope you guys can help.
Background:
We have hundreds of thousands of PDF files created over a 12-year span. None of them contain any metadata, and because they are scans of historical documents, usable OCR text is nearly non-existent. The files are named according to conventions set by historians, so each “collection” has its own naming structure that is completely different from the others.
The Task:
Automatically (through a batch file, script, or third-party software) use each collection's existing naming convention to parse the file names and populate each file's basic metadata fields.
The problem(s):
I could create a script that dumps the directory structure into a CSV file, but from there I'm not sure whether I can parse that into an XML file, or whether it's even possible to generate an XMP or FDF file from it. And assuming it can be done, how do you make a batch process that reads the file containing the directory structure and writes the values into the PDF files themselves?
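For the first step (directory structure to CSV), here is a minimal sketch in Python rather than a batch file; the folder names and output path are placeholders, not anything from your setup:

```python
import csv
import os

SOURCE_ROOT = "collections"   # hypothetical root folder holding all the collections
OUTPUT_CSV = "inventory.csv"  # hypothetical output file

# Walk the directory tree and record each PDF's collection folder, bare
# file name, and full path, so the naming convention can be parsed later.
with open(OUTPUT_CSV, "w", newline="", encoding="utf-8") as fh:
    writer = csv.writer(fh)
    writer.writerow(["collection", "filename", "path"])
    for dirpath, _dirnames, filenames in os.walk(SOURCE_ROOT):
        for name in filenames:
            if name.lower().endswith(".pdf"):
                collection = os.path.relpath(dirpath, SOURCE_ROOT).split(os.sep)[0]
                writer.writerow([collection, name, os.path.join(dirpath, name)])
```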
Examples:
Collection 1: YYYYMMDD-Pub_Type-Pub_Number
Collection 2: Pub_Number-Pub_Type-Author-Desc-YYYYMMDD
In both examples the data could be entered into the metadata fields manually, but since each file is different, it would take forever and a day to do it that way.
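Because each collection's convention is regular, one regular expression per collection can pull the fields out of the file names. A sketch for the two example conventions above (the collection keys and field names are made up for illustration):

```python
import re

# One regex per collection, keyed by the collection folder name.
PATTERNS = {
    # Collection 1: YYYYMMDD-Pub_Type-Pub_Number
    "collection1": re.compile(
        r"^(?P<date>\d{8})-(?P<pub_type>[^-]+)-(?P<pub_number>[^-]+)\.pdf$", re.I),
    # Collection 2: Pub_Number-Pub_Type-Author-Desc-YYYYMMDD
    "collection2": re.compile(
        r"^(?P<pub_number>[^-]+)-(?P<pub_type>[^-]+)-(?P<author>[^-]+)"
        r"-(?P<desc>[^-]+)-(?P<date>\d{8})\.pdf$", re.I),
}

def parse_filename(collection, filename):
    """Return the fields encoded in a filename as a dict, or None if the
    name does not match its collection's convention."""
    pattern = PATTERNS.get(collection)
    if pattern is None:
        return None
    match = pattern.match(filename)
    return match.groupdict() if match else None

# Example:
# parse_filename("collection1", "19450508-Report-0042.pdf")
# -> {'date': '19450508', 'pub_type': 'Report', 'pub_number': '0042'}
```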
We also contemplated mass murder/suicide, but figured it was better to ask for ideas/help… :-)
You can't write directly to the XMP block with JavaScript, only via Preflight or with plugins.
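Outside of Acrobat's own scripting, a third-party library can write the XMP packet directly. A minimal sketch assuming the pikepdf library is installed and that the `fields` dict comes from a parsing step like the one above; the mapping of fields to XMP properties is just one possible choice:

```python
import pikepdf

def write_metadata(pdf_path, fields):
    """Write parsed filename fields into the PDF's XMP metadata.
    pikepdf also updates the classic DocInfo dictionary by default."""
    with pikepdf.open(pdf_path, allow_overwriting_input=True) as pdf:
        with pdf.open_metadata() as meta:
            # Field-to-property mapping is an assumption; adjust to taste.
            meta["dc:title"] = fields.get("desc", "")
            if fields.get("author"):
                meta["dc:creator"] = [fields["author"]]
            meta["dc:description"] = fields.get("pub_type", "")
            meta["dc:identifier"] = fields.get("pub_number", "")
        pdf.save()
```

Looping that function over the rows of the CSV inventory (or the output of the parsing step) would batch-apply the metadata without ever opening Acrobat.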