I have an online newspaper subscription that delivers the daily paper in PDF format.
I convert the newspaper to text and then run automated processing on the resulting file.
Recently, when I use "Save as Text", only a certain section of the PDF is converted, and I can't tell what is causing this.
Is there something internal to the file that is preventing me from saving the entire document as text? I can select the desired text and paste it into a text document, so it seems the permissions are there. If I removed the permissions password, would I regain the ability to save the entire document?
Here is a list of the document properties for the file:
Doc Restrictions:
Printing: allowed
Doc Assembly: not allowed
Content copying: allowed
Content copying for accessibility: allowed
Page extraction, commenting, filling of form fields, signing, creation of template pages: not allowed
Document Security:
Doc Open Password : No
Permissions Password: Yes
Printing : High Res
Changing Document, Commenting, Form Field Fill-in/Signing, Document Assembly : Not allowed
Content Copying : allowed
Content Accessibility Enabled : allowed
Page Extraction: not allowed
Encryption: 128-bit RC4
It is not a tagged PDF.
The PDF producer is listed as CVISION Technologies.
PDF Version: 1.6 (Acrobat 7.x)
I'm viewing it in Adobe Reader 9 on Linux; however, I've tested it on Windows with the same result: only partial text extraction.
An older file that converts in full shows the same document properties.
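To narrow down which pages lose their text, a per-page check along the following lines might help. This is only a sketch under assumptions: it relies on the `pdftotext` tool from poppler-utils being installed (that tool is not part of my current Reader workflow), and the function names and the 50-character threshold are made up for illustration. It counts the non-whitespace characters extracted from each page, so that pages yielding little or no text stand out.

```python
import subprocess
import sys


def extract_text(pdf_path):
    """Run pdftotext (poppler-utils, assumed installed) and return its output."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],  # "-" sends the text to stdout
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def chars_per_page(text):
    """Split on form feeds (pdftotext's page separator) and count
    the non-whitespace characters on each page."""
    pages = text.split("\f")
    if pages and pages[-1].strip() == "":
        pages.pop()  # drop the empty piece after the final form feed
    return [len("".join(page.split())) for page in pages]


def sparse_pages(counts, min_chars=50):
    """Return 1-based page numbers whose character count is below min_chars.
    The default threshold is an arbitrary, illustrative value."""
    return [i for i, n in enumerate(counts, start=1) if n < min_chars]


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python check_pages.py newspaper.pdf")
    counts = chars_per_page(extract_text(sys.argv[1]))
    for page, n in enumerate(counts, start=1):
        print(f"page {page}: {n} characters")
    print("pages with little or no text:", sparse_pages(counts))
```

Running this on both the older file and a recent one would show whether the missing text is confined to particular pages. If those pages contain only scanned images rather than a text layer, no permission setting would explain the difference, but that is a guess on my part.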