Hi,
I was wondering how ClearScan is affected by the amount of memory the computer has? We have almost 200,000 PDFs without text.
I'm planning to use a dedicated computer (PC) to batch process the files.
Thanks.
Ira
The use of a dedicated box would be prudent.
OCR is a resource intensive process.
Do not expect to be able to use a box, doing OCR, for anything else.
Process small quantities of files rather than trying to do all at one time.
(Big chunks will result in a locked box - and it'll take time to figure out which files need to be done again.)
Do not point the installed Acrobat on the dedicated box to stuff out in network space.
Network "burps" will result in lost processing time and, again, time to figure out what got through and what did not. Put all the files on the local HDD or an attached USB device.
Start with small numbers of files and bump it up to identify an optimal quantity of files that can get done without spin-crash-burn.
Say this is 20 to 30 files. After about 6 passes stop, close Acrobat, open Acrobat and restart.
Once or twice a day, close out all. Shut down the box. A few minutes later start up and get the work flow going again.
Yes, a "pain" -- better than the mess and lost time determining where you are when you get a locked box.
Yes, that can and most likely will happen if you bite off more than Acrobat/Windows/the local machine's resources can chew.
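To make the "small batches, know where you are" routine less painful, here is a minimal Python sketch of the bookkeeping side only. It does not drive Acrobat; it just splits a list of local PDF paths into batches and keeps a manifest of what has finished, so after a lock-up you can see what still needs doing. The function names, the JSON manifest file, and the default batch size of 25 are my own assumptions for illustration, not anything Acrobat provides.

```python
import json
from pathlib import Path


def plan_batches(pdf_paths, batch_size=25):
    """Split paths into consecutive batches of at most batch_size files."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    paths = sorted(str(p) for p in pdf_paths)
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]


def load_done(manifest_path):
    """Return the set of paths already recorded as finished (empty if no manifest yet)."""
    manifest = Path(manifest_path)
    if not manifest.exists():
        return set()
    return set(json.loads(manifest.read_text(encoding="utf-8")))


def mark_done(manifest_path, batch):
    """Record a finished batch in the manifest (hypothetical JSON file on the local disk)."""
    done = load_done(manifest_path)
    done.update(str(p) for p in batch)
    Path(manifest_path).write_text(json.dumps(sorted(done), indent=2), encoding="utf-8")


def remaining_batches(pdf_paths, manifest_path, batch_size=25):
    """Plan batches only for files not yet listed in the manifest."""
    done = load_done(manifest_path)
    todo = [p for p in pdf_paths if str(p) not in done]
    return plan_batches(todo, batch_size)
```

Typical use: call remaining_batches() at the start of each session, run one batch through Acrobat by hand or by whatever batch sequence you use, then call mark_done() once you have confirmed the output files. After a crash, only the batch that was in flight needs checking.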
Variables of concern:
Local machine resources (are unnecessary processes turned off - something that can be overlooked).
Size of PDFs to undergo OCR - bigger takes more resources and time.
OCR wants as much RAM as it can get and writes frequently to the HDD.
Turn off screen saver, snooze, sleep, etc.
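Since bigger PDFs take more resources and time, one option is to cap each batch by total size as well as by file count. The following is only a sketch under my own assumptions: the helper names are made up, and the byte budget is something you would have to find by trial on your own box, the same way you find the workable file count.

```python
from pathlib import Path


def local_pdfs_with_sizes(folder):
    """List (path, size_in_bytes) for the PDFs directly inside a local folder."""
    return [(str(p), p.stat().st_size) for p in sorted(Path(folder).glob("*.pdf"))]


def batches_by_size(files, max_bytes, max_files=25):
    """Greedily group (name, size) pairs so each batch stays under both limits.

    A single file larger than max_bytes still gets a batch of its own.
    """
    batches, current, current_bytes = [], [], 0
    for name, size in files:
        too_big = current_bytes + size > max_bytes
        too_many = len(current) >= max_files
        if current and (too_big or too_many):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

Oversized files end up alone in their own batch, which also makes them easy to spot and run separately when the machine is otherwise idle.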
Alternatives
I've used Adobe Capture Cluster in the past. On a dedicated box. Worked (and still does work) very nicely.
Minimal attendance by a "warm body" is needed.
OCR applications similar to Capture Cluster, on a dedicated box, will perform in a like manner.
But, the best bet for "big", unattended jobs would be Server based applications.
Three "providers" that come to mind are Adobe, ABBYY FineReader, or AdLib.
Be well...