
Problem with Reading Order in Text with columns

acaya
Registered: Mar 10 2009
Posts: 3

I have a document downloaded from the Internet (http://www.plus.es/guiatv/iguiaprogramacion.php).
I need to transfer it as a table to Excel or a database.
As you can see, the document has 5 columns in newspaper style: each column reads from top to bottom, and the columns run from the leftmost to the rightmost.
When I try to read the document with IAC and JavaScript (jso.getPageNthWord), it gives me the columns in reverse order: the first one read is the last column on the right.
If I try to select text with PDDoc.CreateTextSelect, no text is returned, although I have tried different values for "Rec".
The file has no tags, but the order shown in TouchUp Reading Order starts with the last column on the right.
I can create tags and change the reading order manually, page by page, but the file has 128 pages, so that is not practical.
Is there some procedure, in the interface or by program (IAC, JavaScript, plug-in), for changing the reading order of all 128 pages of this file?
I use version 7.0, but I tried version 9.0 with the same results.

My Product Information:
Acrobat Pro 7.0, Windows
daka630
Expert
Registered: Mar 1 2007
Posts: 1420
Hi acaya,
A tagged output PDF is required for reorder, reflow and accessibility.
For the 5-column TV program, another plus would be if the authoring/source file were
set up to support threading by article(s).

It seems that the PDF was produced by a 3rd party approach so you may be
stuck with a "dumb" PDF. I suspect you will have to use Acrobat Professional to apply tags
and then manually manipulate the reorder and/or layout articles.

While Acrobat Professional 7 is OK for producing a tagged PDF, Acrobat Professional 8.1.3 or later would
be my choice.

Because the file contains columns of text and is not a table, you do not have any underlying
column/row structure. Content transferred into a spreadsheet will tend to be a bit messy.
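
Since the PDF carries no column structure of its own, one possible workaround for the extraction side is to impose it from word coordinates instead of relying on reading order. The following is a minimal sketch, not a tested solution for this file. It uses the Acrobat JavaScript Doc methods getPageNumWords, getPageNthWord and getPageNthWordQuads. The function names, the fixed column count, the equal-width column assumption and the lineHeight bucket size are all assumptions made for illustration, and it assumes y coordinates grow upward on the page.

```javascript
// Sketch only: rebuilds column-by-column reading order from word coordinates.
// `doc` is an Acrobat JavaScript Doc object (or any object with the same three methods).
// numColumns and lineHeight are values you supply; nothing is detected automatically.

// Gather each word on a page with the left, right and top edges of its first quad.
function collectWords(doc, pageNum) {
  var words = [];
  var count = doc.getPageNumWords(pageNum);
  for (var i = 0; i < count; i++) {
    var text = doc.getPageNthWord(pageNum, i, true);
    var quads = doc.getPageNthWordQuads(pageNum, i);
    if (!text || !quads || quads.length === 0) continue;
    var q = quads[0]; // eight numbers: four (x, y) corner points
    words.push({
      text: text,
      left: Math.min(q[0], q[2], q[4], q[6]),
      right: Math.max(q[0], q[2], q[4], q[6]),
      top: Math.max(q[1], q[3], q[5], q[7])
    });
  }
  return words;
}

// Assign each word to one of numColumns equal-width bands (assumption: the
// columns are equally wide and span the full extent of the text), then sort
// by column, by line from top to bottom, and left to right within a line.
function orderByColumns(words, numColumns, lineHeight) {
  if (words.length === 0) return [];
  var minX = Infinity, maxX = -Infinity, i;
  for (i = 0; i < words.length; i++) {
    if (words[i].left < minX) minX = words[i].left;
    if (words[i].right > maxX) maxX = words[i].right;
  }
  var colWidth = (maxX - minX) / numColumns || 1;
  var keyed = [];
  for (i = 0; i < words.length; i++) {
    var w = words[i];
    keyed.push({
      text: w.text,
      left: w.left,
      col: Math.min(numColumns - 1, Math.floor((w.left - minX) / colWidth)),
      line: Math.round(w.top / lineHeight) // crude line bucket
    });
  }
  keyed.sort(function (a, b) {
    if (a.col !== b.col) return a.col - b.col;     // leftmost column first
    if (a.line !== b.line) return b.line - a.line; // higher on the page first
    return a.left - b.left;                        // left to right within a line
  });
  return keyed;
}

// Return the page text as one string in the rebuilt order.
function pageTextInReadingOrder(doc, pageNum, numColumns, lineHeight) {
  var ordered = orderByColumns(collectWords(doc, pageNum), numColumns, lineHeight);
  var out = [];
  for (var i = 0; i < ordered.length; i++) out.push(ordered[i].text);
  return out.join(" ");
}
```

In the Acrobat JavaScript console this might be called as pageTextInReadingOrder(this, 0, 5, 8) for the first page, then looped over all pages. It does not change the tags or the stored reading order of the PDF; it only reorders the extracted words, and headings that span several columns would need separate handling.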

Looking at the PDF with Acrobat 8's TouchUp Reading Order (TURO) tool, I see no "reorder", which is consistent with the PDF
having no tags.
After adding tags with Acrobat the Recognition Report identifies:
112 reduced confidence pages
2114 figures missing AltText
To fix -
Region Tagging
Paragraph Tagging
List Item Tagging
Table Row Tagging
Table Header Cell Tagging
Table Data Cell Tagging

The resulting structure tree needs *much* editing.
The April program may be available sooner. 8^)

Be well...


acaya
Registered: Mar 10 2009
Posts: 3
Thank you for the response, but I did not understand whether you are offering a possible solution.

I already know that setting the reading order of the pages requires adding tags.

I have already done it on one page and it works. My problem is that I need to do it for 128 pages, and in addition there is a new file every month.

I need some way to automate the process.
Could you tell me whether this is possible, and how to do it?

I'm considering JavaScript and plug-ins, but I can't see how.
daka630
Expert
Registered: Mar 1 2007
Posts: 1420
Hi acaya,

Quote:
Already I have done it on a page...
I need to do it in 128 pages...
I need some way to be able to automate the process.
For read order, I strongly suspect that, with one page done "by hand", you have 127 more to do.
Then repeat every month.

If the PDF was not created in an application that has strong support for tagged output PDF, then you either add tags manually or "automate" using Acrobat's "Add Tags".
The "Add Tags" feature often does not determine the appropriate logical read order
for a complex layout (which is what the program guide has).
Cleaning up the read order is still a manual process, even in a "well formed" tagged output PDF
(which the TV program PDF is not).

But, you never know what is around the corner.
Perhaps an Acrobat plug-in application that addresses read order can be or has been created.

Be well...
