Scan to PDF, OCR file naming capabilities, software wanted!

tmaddison
Registered: May 10 2008
Posts: 4

Hi,

What I've got seems like a pretty run-of-the-mill application for scanning software, but I haven't seen anything that does it. Perhaps someone else has?

I need to scan in documents (call them sales receipts) and convert them to PDF for archiving. The problem is that I want to scan hundreds at a time and save each as its own individual PDF, not as one giant 100-page PDF.

I don't want to have a human sit there and scan each one individually and assign each a filename. I want the software to scan each page, use OCR technology to find the ticket number on the page (it's always in the same location, in the same font...), then output a PDF of that page with a filename something like [TICKET NUMBER]&[DATE/TIME].PDF

For example, ticket 01234 is scanned in on 5/9/08 at 11:20pm, filename should be 01234200805092320.pdf
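The naming rule in that example can be sketched in a few lines of Python. This is only an illustration of the format described above; the function name `ticket_filename` is made up, and it assumes the ticket number has already been read off the page as text.

```python
from datetime import datetime


def ticket_filename(ticket_number: str, scanned_at: datetime) -> str:
    """Build '<ticket><YYYYMMDDHHMM>.pdf' from a ticket number and a scan time."""
    # Keep the ticket number as a string so leading zeros (e.g. "01234") survive.
    stamp = scanned_at.strftime("%Y%m%d%H%M")  # 24-hour clock: 11:20pm becomes 2320
    return f"{ticket_number}{stamp}.pdf"


# Ticket 01234 scanned on 5/9/08 at 11:20pm
print(ticket_filename("01234", datetime(2008, 5, 9, 23, 20)))
# prints 01234200805092320.pdf
```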

And of course it needs to do this automatically for a stack of tickets that are fed through the scanner.

Conceptually, I'd think if there were OCR software that one could define an area of the page to look at, then OCR what's in that area, transfer it to a variable, and use that variable in the file naming process we'd be done. Sounds simple, eh?
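As a rough sketch of that idea (fixed region, OCR into a variable, variable into the filename), something like the following Python would do. Everything named here is hypothetical: `Region`, `name_scanned_pages`, and the `ocr` callback are placeholders, and the callback stands in for whatever OCR engine is actually used. The stand-in at the bottom just fakes the OCR step so the flow can be run.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable


@dataclass(frozen=True)
class Region:
    """Fixed rectangle on the page (in pixels) where the ticket number is printed."""
    left: int
    top: int
    right: int
    bottom: int


# Hypothetical OCR hook: given a page image and a region, return the text found there.
OcrFn = Callable[[object, Region], str]


def name_scanned_pages(pages: Iterable[object], region: Region,
                       ocr: OcrFn, scanned_at: datetime) -> list:
    """Pair each scanned page with the filename it should be saved under."""
    stamp = scanned_at.strftime("%Y%m%d%H%M")
    named = []
    for page in pages:
        ticket = ocr(page, region).strip()  # the "variable" holding the OCR result
        named.append((page, f"{ticket}{stamp}.pdf"))
    return named


# Stand-in OCR for demonstration only: each fake "page" already carries its text.
def fake_ocr(page, region):
    return page["ticket"]


pages = [{"ticket": "01234"}, {"ticket": " 01235\n"}]
ticket_area = Region(left=50, top=40, right=300, bottom=90)  # made-up coordinates
for _, filename in name_scanned_pages(pages, ticket_area, fake_ocr,
                                      datetime(2008, 5, 9, 23, 20)):
    print(filename)
```

Passing the OCR step in as a function keeps the naming logic independent of any particular scanner or OCR product.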

I just can't imagine there aren't a bunch of options out there. I'd think this would be useful to every retailer who doesn't want to either keep paper copies of signed invoices or have humans involved in the scanning, but maybe it's not that common?

Thanks

Todd

davidknapp
Registered: May 15 2008
Posts: 8
We use an application called Metafile Enterprise to do just that. We scan in thousands of documents daily. In our case, a barcode on the first page is read and assigns a searchable key to the document. The software does have OCR capabilities, but we don't use them, as barcodes are MUCH more reliable. Your documents need to be designed for machine processing - a redesign could be in order here. A macro imports the document info to an SQL database with pointers to the actual images. The documents can then be converted to pdf by macro. This is an enterprise application that is pretty expensive at first, but very useful after the training. We have 5 Kodak scanners (60 ppm) and millions of documents that are searchable by up to 50 people at once. Web searches are also available for an extra investment.

Since you are going to scan in documents, you also must have some sort of quality assurance process. Scanners double sheet feed, crumple pages, etc. I say the whole technology is built upon the quality of little rubber rollers. We store health information documents which must be searchable for 20 years. You need to be aware of the whole document life cycle: document prep (what, none are stapled or folded?), scanning, Q/A, storage (note, each scanned page can get pretty big if color is involved), reporting, searching, and long-term storage (we use DVDs to keep off-site images, and have had problems with them after 5-7 years). There are lots of other apps, but I doubt if any really good ones are off-the-shelf. Good luck.
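For an OCR-named workflow, one small piece of that Q/A step could be automated: flag any page whose extracted ticket number looks wrong so a person only has to check those. The Python below is a minimal sketch, not a full Q/A process. It assumes ticket numbers are exactly five digits (guessed from the 01234 example) and that a repeated number is suspicious; the name `pages_needing_review` is made up.

```python
import re
from collections import Counter

# Assumption: a valid ticket number is exactly five digits, as in "01234".
TICKET_PATTERN = re.compile(r"\d{5}")


def pages_needing_review(tickets):
    """Return the positions of pages whose OCR'd ticket number is unreadable or repeated."""
    counts = Counter(tickets)
    flagged = []
    for index, ticket in enumerate(tickets):
        unreadable = TICKET_PATTERN.fullmatch(ticket) is None
        duplicate = counts[ticket] > 1  # same number read from more than one page
        if unreadable or duplicate:
            flagged.append(index)
    return flagged


# One OCR result per scanned page, in feed order.
print(pages_needing_review(["01234", "0123S", "01235", "01235", ""]))
# prints [1, 2, 3, 4]
```

This does not catch a double feed directly (the hidden page simply never appears), so comparing the page count against the number of sheets loaded is still worth doing by hand.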
tmaddison
Registered: May 10 2008
Posts: 4
Thanks. I've checked out their site and it DOES look great. It's not clear what it costs to buy the software outright, but their managed monthly solution is $1000/month, about $980/month more than I could even imagine for the few hundred docs a month I need to deal with...

I've emailed them anyway to see if they have a more limited version for cheaper, we'll see. Meantime if anyone else has any solutions for this that are more within the range of a small 10-person company I'd love to hear about it!

Thanks,

Todd
davidknapp
Registered: May 15 2008
Posts: 8
I suspected the price would be out of your range, sorry. We pay $15K per year for licenses, but HIPAA makes that just a cost of doing business. We also have 30 folks just doing lookups 24/7.

Another option, which I will check out: a new scanner we recently purchased (a Kodak i260c, I believe) came with a bunch of software that included some OCR capability and the ability to make PDF files. It has versions of over-the-counter stuff like PaperPort plus some Kodak-written software. Have you looked at PaperPort? The scanner was about $900, which may also ruin your $20-a-month budget, but that price included the software. The scanner handles batch feeding, which many don't.
tractionsoftware
Registered: May 3 2007
Posts: 27
Hi,

What I would do is scan all your documents to a single PDF file, open it in Acrobat, and use the built-in Acrobat OCR (or the Adobe Capture product) to OCR the PDF.
Then use a tool called 'PDF Content Split', which we provide, to split the PDF on text found in the pages and name each output file after that text. It only costs $99.95 U.S.

free trial available here:
http://www.traction-software.co.uk/pdfcontentsplit

Lee.
Traction Software
http://www.traction-software.co.uk