Web page into Tagged PDF

Mam
Registered: Apr 12 2007
Posts: 3

Any pointers on how to tag a PDF or set advanced properties while converting a webpage to PDF?

Regards,
Mam

My Product Information:
Acrobat Pro 8.1, Windows
daka630
Expert
Registered: Mar 1 2007
Posts: 1420
Something to try:
File - Create PDF - From Web Page
This gives you the "Create PDF from Web Page" dialog.
You can make some settings selections here.
Continuing -
Click on the Settings button (bottom right)
This opens the "Web Page Conversion Settings" dialog.
Under the General tab there are panes: File Type Settings & PDF Settings.
For the File Type Settings, select HTML, then click on the Settings button at the right.
In the HTML Conversion Settings dialog there are two tabs (General & Fonts and Encoding).
Browse each to see the configuration settings available.
(For instance, you can configure Multimedia options: disable, embed if possible, or reference via URL.)

In the bottom pane (PDF Settings) you can check/uncheck:
Create bookmarks | Place headers and footers on new page | Create PDF tags | Save refresh commands


Also:
Edit - Preferences - select Category "Web Capture"
This provides some settings for Open Options and for Conversion Options.



Be well...

Mam
Registered: Apr 12 2007
Posts: 3
Thanks for the reply.
I checked the PDF settings to 'Create PDF tags' before converting but html tags are not converted to respective pdf tags.

Regards,
Mam
daka630
Expert
Registered: Mar 1 2007
Posts: 1420
Mam wrote:
...but html tags are not converted to respective pdf tags.
Could you clarify what role mapping to PDF tags you expected?

With "Create PDF tags" selected, a structure tree will be created.
The adequacy of this tree is determined by the quality of the HTML input.

While PDF provides structure elements analogous to those found in HTML, XML, and SGML, none of these are true "equivalents".
The HTML <p> tag will typically be role mapped to the PDF paragraph element <P>.
Sometimes the HTML heading tags will role map to the PDF heading elements (<H1> through <H6>).
Acrobat can be relied upon to map the HTML <table> tag to the PDF <Table> element.
Of course, if the HTML table is a layout table then the PDF table element will be malformed, which adversely affects the "health" of the PDF structure tree.
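
To make the idea of role mapping a bit more concrete, here is a minimal Python sketch. It is not Acrobat's actual conversion logic, which is not documented in this thread. The table name HTML_TO_PDF_ROLE, the function guess_pdf_role, and the choice of "P" as the fallback are all hypothetical; the values on the right-hand side are standard structure types from the PDF specification.

```python
# Illustrative only: a hypothetical lookup table, NOT Acrobat's real algorithm.
# Keys are HTML tag names; values are PDF standard structure types.
HTML_TO_PDF_ROLE = {
    "p": "P",
    "h1": "H1",
    "h2": "H2",
    "h3": "H3",
    "h4": "H4",
    "h5": "H5",
    "h6": "H6",
    "table": "Table",
    "tr": "TR",
    "th": "TH",
    "td": "TD",
    "ul": "L",
    "ol": "L",
    "li": "LI",
    "a": "Link",
    "img": "Figure",
}


def guess_pdf_role(html_tag: str, fallback: str = "P") -> str:
    """Return a best-estimate PDF structure type for an HTML tag name.

    Unknown or presentational tags fall back to a paragraph; that fallback
    is an assumption made for this sketch.
    """
    return HTML_TO_PDF_ROLE.get(html_tag.strip().lower(), fallback)


if __name__ == "__main__":
    for tag in ("H2", "table", "font"):
        print(tag, "->", guess_pdf_role(tag))
```

Note that a lookup like this knows nothing about how a tag is used. A <table> used purely for layout still comes out as Table, which is exactly the malformed-structure problem described above.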

In sum, a loose usage of HTML can make appropriate role mapping problematic.

Unfortunately, there is an overabundance of poorly authored HTML; consequently, content repurposed from it (to PDF or other formats) reflects this.

What Acrobat tries to do in such cases is to map the input HTML tags (such as they may be) to a "best estimate" of the nearest equivalent.
This can be something like the silk purse from a sow's ear; not always too tidy, eh?

Be well...