Switching Off Automatic OCR

elliotdm
Registered: Nov 18 2009
Posts: 4
Answered

On certain PDF files that, as far as I can tell, were created electronically rather than scanned (they're from third parties), Acrobat 9.2.0 Pro Extended performs automatic OCR every time a page is viewed.

Is there a way to prevent/switch this off? Even whilst scrolling through the document, it doesn't "remember" the OCR it performed the last time you scrolled past the page.

There don't seem to be any settings in the preference options to turn this function off.

This makes affected documents very slow to read and is intensely frustrating. Any help would be appreciated.

Thanks,
Derek

Environment: WinXP + SP3
Machine: HP Compaq 6735s

My Product Information:
Acrobat Pro Extended 9.2, Windows
lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
Can you explain a bit more about what is occurring? Acrobat doesn't initiate OCR automatically. What dialog boxes are you seeing? Is Acrobat "reading" through the PDF?

Lori Kassuba is an AUC Expert and Community Manager for AcrobatUsers.com.

elliotdm
Registered: Nov 18 2009
Posts: 4
It appears to be initiating OCR automatically in this case...

A small dialogue box pops up in the bottom right-hand corner of the window and goes through four stages:

"Deskewing image"
"Rotating Image"
"Decomposing Page"
"Recognising Text"

Since this happens at each page, it seems Acrobat is reading the document, and takes quite a time to do so (~10 seconds per page).

There are no "images" as such, just text, created via Acrobat Distiller from Microsoft Word (I assume), rather than scanned PDF images.
lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
Acrobat will only initiate this if the file is in some sort of image format (e.g., TIFF, JPEG, BMP) when it is opened. Can you post an example? [url=https://www.acrobat.com/]Acrobat.com[/url] offers a free service to post files.

Ferdi
Registered: Nov 25 2009
Posts: 1
Good thread, I have the same problem. I can't get the document to not do these steps while scrolling.

It slows down reading too much and is annoying. I use Windows 7 64-bit with Acrobat Standard 9.0.

Hope someone has the answer

Regards, Ferdi

lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
Ferdi wrote:
Good thread, I have the same problem. I can't get the document to not do these steps while scrolling. It slows down reading too much and is annoying. I use Windows 7 64-bit with Acrobat Standard 9.0.

Hope someone has the answer

Regards, Ferdi
Make sure you've updated to 9.2 -- this was the first version to support Windows 7.

B_Hebert
Registered: Nov 25 2009
Posts: 5
In my experience with Acrobat 9.1 and 9.2 on various systems running Vista or XP Pro, it has nothing to do with the OS; rather, it seems to be machine specific. It also happens with various types of files, as others have pointed out.
I have the exact same annoying problem on my system, which makes a $500 piece of software almost unusable since it takes forever to move from page to page.
Definitely a bug.

Other threads dealing with this:
http://forums.adobe.com/message/1988985
http://www.techplex.net/acrobat/214100

Has anyone found a pattern to this?
lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
Can someone on this thread post an example of a file that exhibits this problem? Thx.

B_Hebert
Registered: Nov 25 2009
Posts: 5
Here is an example:
http://docs.blackberry.com/en/smartphone_users/deliverables/11164/BlackBerry_Bold_9700_Smartphone-User_Guide-T643442-643442-1005020122-001-5.0-US.pdf

Works well in Acrobat 8, not 9. Works well on one system, not the other.

BUT, most interestingly, if I open the above link under Acrobat 9.2.0 and view the file in the Internet Explorer plug-in, I can change pages instantly. Saving the file to disk and opening in stand-alone Acrobat reproduces the problem. Hmmm, interesting.

We're on to something.
daka630
Expert
Registered: Mar 1 2007
Posts: 1420
Just me nattering, but not all PDF producing agents are equal.
Some comply well with the PDF Reference documents (now ISO 32000-1:2008); some do not.
Most times, most "ill behaved" PDFs are, to a non-trivial extent, associated with an agent that is a little or a lot "off".
As PDF is now with ISO I suspect that more of the non-Adobe products will become more capable.
But that's down the road a piece.

The example PDF's "pedigree":

Application: XLS Formatter V3.4 R1
PDF Producer: Hyf PDF Output Library 2.3.0 (Windows)

Be well...

lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
I had no difficulty opening the file and it didn't produce the dialog mentioned above. I did notice, however, that there were a few errors in the PDF syntax that seem to be associated with an Open Action. Perhaps the 3rd party tool isn't correctly writing the syntax and is causing something to occur with an Open Action.
Here is a [url=https://acrobat.com/#d=CBjyaDpK*ECxQrF3LtVzfw]link to the PDF Syntax information[/url] if you're interested.
Here is a [url=https://acrobat.com/#d=UZ8mfDM7iUy8UwTKJa2k6w]link to a version of the file[/url] that has had the syntax fixed. Let me know if you see any improvement in this version.

B_Hebert
Registered: Nov 25 2009
Posts: 5
Thanks for producing this file.
Unfortunately, I get the same slow "automatic OCR processing" (or whatever it is!).
I opened the new file from within the Internet Explorer plugin. Same as the original file, it does not produce the slow processing...

Just for fun, I printed a PDF from the original (slow) pdf. Although I lose the bookmarks, it's more convenient. I doubt the bookmarks have anything to do with the problem, but at this point, I don't know.
lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
It appears that something in the syntax is triggering the problem. This would make sense since you don't see the issue in the browser. This is because the browser capability is a subset of the full Acrobat capability (i.e., you cannot initiate OCR in the browser).

rbogie
Registered: Apr 28 2008
Posts: 432
I experimented with the BlackBerry file in Acrobat Pro 9.2.0 but could not reproduce the problem. However, I suggest trying 'Reduce File Size' (Acrobat 9), which may resolve the problem.
hinnymule
Registered: Nov 27 2009
Posts: 20
Same issue here. I just assumed that it was some setting that I could turn off, but I never could find the setting. I finally got annoyed enough that I did a Google search, which landed me here. It doesn't seem to be a very common problem, judging from the fact that I found VERY few relevant Google hits. I'm using 9.2 on WinXP SP3. I tend to read scholarly articles typeset with LaTeX. They contain no images; just lots of equations. Nonetheless, Acrobat "rotates, deskews, decomposes" every page when I scroll to it. This process takes about 2 seconds on each new page, rendering quick perusal (via scrolling) impossible -- or, at least, incredibly frustrating.

Not that it matters much, but [url=http://arxiv.org/pdf/physics/0311011]here's[/url] a pdf that I have this problem with.

HELP!!!!!
hinnymule
Registered: Nov 27 2009
Posts: 20
Two things that didn't work:

* I tried (as suggested by rbogie) "Document > Reduce File Size...". I tried all the options (i.e., compatibility with Acrobat 9, Acrobat 8, ..., Acrobat 4). Each time, I saved the "reduced" file, then reopened it. Still the same problem.
* Since there's a command in Acrobat 9 for "Document > OCR Text Recognition > Recognize Text Using OCR...", I tried that. No improvement. Still rotates, deskews, decomposes every page.

This is turning out to be more difficult to solve than I anticipated.
hinnymule
Registered: Nov 27 2009
Posts: 20
Here's a [url=http://picasaweb.google.com/dan.lacourse/UntitledAlbum02?feat=directlink]screenshot[/url] of the behavior in question.

daka630
Expert
Registered: Mar 1 2007
Posts: 1420
Hi hinnymule,
A toss of the turkey bones leaves me with the thought that, perhaps, the issue is related to how the authoring application applies equations.

Equation 1.11
Could be laid out as part of the text flow (ANSI No. 0103 from the symbol font set, then the alphanumeric characters from a standard font set).
As such, it would be expected to be renderable text in the PDF.

Equation 1.12
Laid out with an "equation tool" in the application (?).
If so, such typically gets "wrapped" in something.
When the output PDF is produced this "bundle" could be expected to be treated as a figure/image of non-renderable text in the PDF.
Encountering this non-renderable text, Acrobat processes it.

Be well...

lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
hinnymule wrote:
Two things that didn't work:
* I tried (as suggested by rbogie) "Document > Reduce File Size...". I tried all the options (i.e., compatibility with Acrobat 9, Acrobat 8, ..., Acrobat 4). Each time, I saved the "reduced" file, then reopened it. Still the same problem.
* Since there's a command in Acrobat 9 for "Document > OCR Text Recognition > Recognize Text Using OCR...", I tried that. No improvement. Still rotates, deskews, decomposes every page.
This is turning out to be more difficult to solve than I anticipated.

Are you using any plug-ins with Acrobat 9?

daka630
Expert
Registered: Mar 1 2007
Posts: 1420
Something else that occurs to me.
For Acrobat 8 -
Open the Acrobat Scan dialog.
(File > Create PDF > From Scanner)

For Acrobat 9 -
(File > Create PDF > From Scanner > Configure Presets)
or
(File > Create PDF > From Scanner > Custom Scan)

At the bottom of the dialog window, de-select "Make Searchable (Run OCR)".
Just because - close & reopen Acrobat and then open your PDF.
Does this have any effect?

Be well...

lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
Apparently this is not for OCR, it's for Reading Out Loud. Try turning off View > Read Out Loud > Deactivate Read Out Loud.

hinnymule
Registered: Nov 27 2009
Posts: 20
Hi lkassuba & daka630,

"Read Out Loud" was already deactivated. I turned off "Make Searchable (Run OCR)". No improvement. Here are the plugins:
Quote:
Acrobat Accessibility
Acrobat Public-Key Security
Adobe DRM
Capture3D
Catalog
Comments
Compare PDF
Convert2AdobePDF
ConvertInDesign2AdobePDF
Database Connectivity
Digital Signature
DVA
ECMAScript
Editor
Forms
Highlight Server
HTML2PDF
Image Conversion
Import3D
Internet Access Plug-in
JDF ProdDef
Make Accessible
Multimedia
PaperCapture
PDDom
PDF Consultant, Access Check
Preflight
PaperCapture
Read Out Loud
Reflow
SaveAsRTF
SaveAsXML
Scan
Search
Search5
SendMail
Spelling
Table Picker
TouchUp
Updater
Web2PDF
Weblink
XPS Conversion
That's a lot. It'd be nice if Acrobat had a "Safe Mode" wherein I could launch it with the plugins disabled. In lieu thereof, how do I individually (& temporarily) disable a plugin?
rbogie
Registered: Apr 28 2008
Posts: 432
Install Adobe Reader (any version) and use Reader (not Acrobat) to view your document.
hinnymule
Registered: Nov 27 2009
Posts: 20
Hi rbogie,

I installed Adobe Reader 9.2. It helped. I still get the delay, but it is very slight. The little progress meter appears (when scrolling), but just flickers for a fraction of a second. This might be normal - not sure. It might have to do with "Assistive Technologies", about which I was forced to answer a whole bunch of questions even though I don't use any assistive technologies. *Sigh*.

Considering that the vast majority of users, like me, don't require assistive technologies, wouldn't it be sensible, on Adobe's part, to include a little check box that says "I don't use assistive technologies; set up Acrobat Reader accordingly"?

In particular, I seem to be forced to make a decision about how Reader should "infer reading order", which, I think, means the order that the document should be read [i]out loud[/i]. There is no option to just turn off this feature. Does that make any sense?
Why can't Reader just forget about "inferring reading order" until/unless I [i]ask it[/i] to read the document out loud?! ...Especially since I'd have to "activate" Read Out Loud anyway: (View > Read Out Loud > Activate Read Out Loud). Instead it [i]insists[/i] on inferring the reading order [i]right now[/i]. My only options are:

* Infer reading order from document (recommended)
* Left-to-right, top-to-bottom reading order
* Use reading order in raw print stream

with sub-options:

* Only read the currently visible pages
* Read the entire document at once
* For large documents, only read the currently visible pages

Even worse, if I choose the helpful-sounding "Use recommended settings and skip setup", then Reader chooses [i]preposterous settings![/i] For example, it un-checks the box next to "Display PDF documents in the web browser"! This is software designed by one of the world's biggest, smartest software companies?
hinnymule
Registered: Nov 27 2009
Posts: 20
As a followup, I think the flickering progress meter in Acrobat Reader 9.2 really is caused by its "inferring reading order", because if I change the preference (Document > Accessibility Setup Assistant...) to "Left-to-right, top-to-bottom reading order" and "Read the entire document at once", then I don't get the flickering progress meter. I can scroll through quickly and smoothly. Hurray!

Except that, upon opening the document, I must wait 20s or so for Reader to "prepare the document for reading" -- even though I have no intention of asking Reader to read the document out loud to me. Is there no way to turn off Reader's obsession with Reading Order?
hinnymule
Registered: Nov 27 2009
Posts: 20
[b][u]THE SOLUTION[/u][/b]

[i]Accessibility[/i] was the problem -- in particular, it appears that Acrobat was rotating, deskewing, etc. as part of its insistence on "Inferring Reading Order" page-by-page.

The solution was:

* Edit > Preferences... > Reading
* Reading Order: "Use reading order in raw print stream".
* Page vs Document: "Only read the currently visible pages".

Thank you all for the help. It does not go unappreciated! lkassuba was on the right track when she said "Apparently this is not for OCR, it's for Reading Out Loud. Try turning off View > Read Out Loud > Deactivate Read Out Loud." However, her solution wasn't quite 100%.

I continue to contend that this whole issue only exists because of a truly horrendous design decision by Adobe: namely, to have Acrobat infer a "reading order" for every document automatically (and irreversibly) [i]even though the user has not indicated any desire to have the document read-out-loud to himself or herself.[/i]

I can think of no valid technical reason that the inference of reading order couldn't be performed only [i]after[/i] the user has asked to have the document read-out-loud. In the absence of such a technical reason, I can only speculate that this screwy design was implemented for legal reasons, perhaps having to do with the Americans with Disabilities Act. But that doesn't fully explain it either, because
1) Other pdf reading software (Foxit, etc.) doesn't do this.
2) It seems like overkill. My view of sight-impaired people is hardly so uncharitable
as to believe that they insist that the rest of us not be given the option to turn
off reading-order-inference.

Nevertheless, this has been a fun learning experience!

P.S. I honestly didn't join this group in 1969!!
lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
If you want to open Acrobat without any plug-ins, just hold the shift key while launching the application. Not that you need it at this point.

It sounds like you have some sort of screen reading device on your system. Try disabling the Microsoft Narrator by going to Start>All Programs>Accessories>Accessibility>Utility Manager, or simply pressing Windows+U. Then Stop the Narrator.

hinnymule
Registered: Nov 27 2009
Posts: 20
Thanks lkassuba,

Microsoft Narrator was indeed turned on. I turned it off. I thought that maybe with Narrator off, Acrobat would no longer infer reading order automatically, but that doesn't seem to be the case. Nonetheless, nice to know that it was on, and how to turn it off. And really nice to know how to start up Acrobat without plug-ins! Thanks!!
nightalon
Registered: Dec 3 2009
Posts: 1
Note that I first started having this problem in 9.2.0 on Windows 7 Pro 64-bit.

With one of the 9.1.x updates I used to get it only when I opened the document for the first time. After the update I got it every page, and sometimes twice every page. I thought it was the file's fault, but apparently it is the update's fault.

Adobe, this is a BUG! Please fix it.
munnever
Registered: Jul 5 2007
Posts: 3
I have followed all the above mentioned steps in this forum and this is still happening.

A few of our staff get the pop-up boxes described above when they are downloading Adobe documents from our third-party online file cabinet. We have Adobe Professional 9.0 on the latest update (9.2).

The process is as such:

The Content Preparation Progress box appears.
Then, while scrolling through each page, the three Deskewing, Rotating, and Decomposing messages appear at the bottom right of the screen.

I don't get these messages myself. I copied my Preferences settings to one of the affected computers, and it is still happening there.

I have a Dell and they have HP computers. We are all on Vista SP1/2.

Please let me know what other options I can check on this. We do need plug-ins in order to run that third-party program.
distill
Registered: Apr 23 2010
Posts: 1
I just wanted to add that I had these horrible problems: long waits between page skips. I remember when a new update came (maybe 9.3 or something), I was very busy but suddenly Acrobat gave me the Advanced > Accessibility > Setup Assistant treatment or similar. Very annoying. As with any other software, I really wish any updates would be completely invisible. I don't want to click any Update yes or no things, I don't want to wait for the update process and I certainly don't want to be forced to choose any new accessibility for the blind options.

Anyway, Acrobat went really slow and there seemed to be several reasons. Spontaneous OCR (why?!), spontaneous read out loud functionality, Content preparation progress, some other useless processing. So, there is C:\Program Files\Adobe\Acrobat 9.0\Acrobat\plug_ins\Accessibility.api and ReadOutLoud.api. And I tried to rename those, but that didn't help! Also Advanced > Accessibility > Change reading options didn't help.

Instead, this helped:

http://kb2.adobe.com/cps/328/328995.html (= Edit > Preferences > Reading > Only read the currently visible pages)
JazzBaby
Registered: Mar 30 2011
Posts: 1
I have a similar problem, but what I want to do is electronically CREATE PDFs without having OCR active. I don't seem to have the problem READING the PDFs that other folks have, but when I save an MS Word doc to PDF, it automatically runs OCR on it (going through all those steps noted previously), causing the file to be HUGE. I figured out how to disable the OCR feature with scanned images, and it reduced the file size of a 7-page document from 3 MB to 64 KB. Now I just want to save a Word doc as a PDF without OCR. The last one I did was over 5 MB, which makes it difficult for some folks to open such a large file. I want to add my vote for Adobe Acrobat to have the ability to turn off OCR when creating PDFs. It would also be nice to be able to set it as a default. Even with the scans, I have to disable it each time.