This tutorial shows you how to work with the Accessibility features in Acrobat X.

Accessible PDF Files from Microsoft Word using Acrobat X

Learn best practices for creating accessible PDF files from Microsoft Word in Acrobat X Pro and Suite.

By UVSAR – February 21, 2011

 



In this tutorial, learn best practices for creating accessible PDF files from Microsoft Word in Acrobat X Pro and Suite. It includes specific tips on how to build tables, styles and text boxes in Word that are accessible in a PDF document.

Hi, I'm Dave Merchant from the Acrobat User Community, and today I'd like to introduce some of the best practices when you're making accessible PDF files from Microsoft Word.

I'll be using Acrobat X and Office 2007 today, but the same techniques will work in earlier versions.

Accessibility is not simply about catering to users who have screenreaders.

The main thing an accessible PDF file has is a system of tags showing the logical order of elements - that's the order in which you as a human would read the content of a document, following an article as it flows between columns or different boxes on a page, understanding the difference between a heading, caption, a table and a block of paragraph text.

When these tags have been defined, the software understands how to put this content back together into a single flowed column that's used by screenreaders and Braille printers, but also mobile devices will often reflow a page into a single column to display on a small screen and they need to know the order to put things into.

When you're copying content or exporting part or all of a PDF file into a different format, the same rules apply and you'll often find if you try to select text on a non-tagged PDF file you can get some very strange things happening, because there's no inherent order in the content of a PDF as for example there is in a web page.

Without a tag system, the software has to guess the order, and it uses page position: left to right, top to bottom.

For most documents that isn't going to work properly.
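To see why guessing from page position goes wrong, here is a minimal sketch in Python. The block names and coordinates are made up for illustration, and this is not how Acrobat actually works internally - it only mimics the "left to right, top to bottom" guess on a simple two-column page.

```python
# Hypothetical text blocks on a two-column page: name -> (x, y),
# with y measured down from the top of the page (all values are made up).
blocks = {
    "column 1, top": (50, 100),
    "column 1, bottom": (50, 400),
    "column 2, top": (320, 100),
    "column 2, bottom": (320, 400),
}

# The tag order is what the author intended: all of column 1, then column 2.
tag_order = ["column 1, top", "column 1, bottom", "column 2, top", "column 2, bottom"]


def guess_order(blocks):
    """Order blocks by page position only: top to bottom, then left to right."""
    return sorted(blocks, key=lambda name: (blocks[name][1], blocks[name][0]))


guessed = guess_order(blocks)
print("Guessed:", guessed)
print("Tagged: ", tag_order)
```

The guessed order jumps across to the top of column 2 before it has finished column 1, which is exactly the kind of strange selection behaviour you see on an untagged file.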

It will also include everything - screenreaders on an untagged document will for example read all the headers and footers - you don't want that to happen, so the tag system also has an important role in defining Artifacts - these are elements which we don't want screenreaders to read, and we don't want copied when we're pasting the contents of the document.
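As a rough illustration of what an Artifact does, the Python sketch below treats a page as a list of items, each with a role. The page content and the role names are assumptions made up for this example, not real PDF tag names.

```python
# Hypothetical page content in reading order: (text, role).
# The role "artifact" stands for anything the tag system marks as an Artifact.
page = [
    ("Company Newsletter - Page 1", "artifact"),  # header
    ("Safety update", "heading"),
    ("Our team has reviewed the procedures.", "paragraph"),
    ("www.example.com", "artifact"),  # footer
]


def readable_content(items):
    """Return only the text a screenreader or a copy operation should see."""
    return [text for text, role in items if role != "artifact"]


spoken = readable_content(page)
print(spoken)
```

Without the Artifact marking, the header and footer would be read out along with everything else.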

It also makes sense to do as much of this as possible in Word, so the least amount of work needs to be done on the PDF.

PDF is an end destination format, so you want to be able to change your Word document, save a new PDF, and not have to start re-applying all your tags in Acrobat.

It's possible to do everything after the fact in the PDF, and we have other videos to demonstrate that, but ideally we want it done in the original file.

Some of these workflows might seem a little onerous, but once you get into the habit they actually make far better and more technically-correct Word documents, even if you never make them into PDFs.

We're going to make sure we use styles and headings properly. We'll only ever use tables to display grids of data, not to lay out columns of text on a page. We need to add descriptions to visual elements such as images, and make sure hyperlinks behave properly. We need to fill in the document properties in Word so the PDF has a title, description and keywords, because people who are blind need to be able to search for PDF files without relying on a thumbnail image. And we need to be careful what we do with headers and footers - by default when you export a tagged PDF from Word the headers and footers will be made into Artifacts, so they'll be ignored by the screenreader and by copy/paste operations.

If you've got active things in your headers and footers such as hyperlinks, that can start causing problems when you run an accessibility check, because they'll be flagged as something the user might need to click on, but which is in an inaccessible part of the document.

Finally we need to make sure we export using Acrobat's PDFMaker plugin, not the print menu.

If you print to PDF from any file format, you simply get a virtual printout - all of this internal structure is removed - so we need to make sure we Save As rather than Print.

Now our goal is not solely to pass the Accessibility Checks.

This document has tags, it's been exported from Word, it will pass every Accessibility Check we throw at it, but it's broken - quite horribly broken.

If we look at it from a distance, you can see we have an article that flows down the right hand side, and another article in blue in the bottom left corner, but if I start trying to select something - I'll grab this area of text - we're OK up to the word "team".

I should be selecting the word "Inadvertent" next in the sequence, but I actually start grabbing all of this extra stuff on the left.

The logical order might be present, but it's still wrong - and that's one of the issues with the Accessibility Checking tools; they don't understand the meaning behind content, they simply check if the tags are present.

The reason this isn't working properly is down to the way we constructed it in Word, so let's have a look at our Word file.

The reason all of this stuff is being selected in the wrong order is because we used a table.

If I turn on the borders you can see that we have a set of cells with the text dropped in.

Tables are always read left-to-right, top-to-bottom in row order, so when the Accessibility tool reaches the word "team", the next thing it's going to do is open the table and start with this first cell, and start reading the word "Safety".

When it gets to the end of this cell, it'll start reading the word "Inadvertent" and so on.

So, tables cause a huge problem with layouts - we should really be using a text box and letting the rest of the page content flow around it.

You'll also notice that there are some blank lines; instead of using paragraph spacing to define our paragraph breaks, we just put an empty line in - that's going to cause an empty tag in the PDF.

It's not a critical issue but it can cause problems when you're copying and pasting parts of the document.
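If you wanted to hunt for those blank lines automatically, the idea is simple enough to sketch in Python. The list of paragraph texts below is a made-up stand-in for whatever you'd extract from the Word file, and the function name is hypothetical.

```python
# Hypothetical paragraph texts, in document order. Empty strings (or strings
# holding only spaces) stand for blank lines used as paragraph spacing.
paragraphs = ["First paragraph.", "", "Second paragraph.", "   ", "Third paragraph."]


def empty_paragraphs(texts):
    """Return the positions of paragraphs that contain no visible text."""
    return [i for i, text in enumerate(texts) if not text.strip()]


blank_positions = empty_paragraphs(paragraphs)
print(blank_positions)
```

Each position reported is a paragraph that would end up as an empty tag in the PDF, and is a candidate for replacing with paragraph spacing.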

If we look at styles, this style at the top is defined as a "Title", which is fine in Word but a Title isn't actually a Heading, so it isn't going to be transferred into Acrobat as a Heading tag.

This bit here isn't even a heading - it's the Normal style with manual character changes.

That's something we want to avoid - we want the document to identify that this is Heading 1, Heading 2, Heading 3, etc.
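The rule of thumb can be sketched in Python as follows. The mapping from Word's built-in "Heading 1" to "Heading 6" style names onto the H1 to H6 tags is an assumption about the default export behaviour, and the sample paragraphs are made up.

```python
import re


def heading_tag(style_name):
    """Return the heading tag we'd expect for a Word style, or None if the
    style is not one of the built-in Heading styles (assumed mapping)."""
    match = re.fullmatch(r"Heading ([1-6])", style_name)
    return "H" + match.group(1) if match else None


# Hypothetical document: (style, text) pairs.
paragraphs = [
    ("Title", "Site Safety Bulletin"),     # a Title is not a Heading
    ("Normal", "Inadvertent activation"),  # bold Normal text only looks like a heading
    ("Heading 1", "Contacts"),
]

tags = [heading_tag(style) for style, _ in paragraphs]
print(tags)
```

Only the last paragraph would come through as a heading - the other two just look like headings on the page.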

With tables, we also need to be careful when we produce a genuine table that we take account of the reading order.

You can organize the tags in Acrobat to read a table column-by-column, but by default it will go row-by-row.

If you look at this table there's no particular reason why we have to have it laid out in this order, but if we read this in Acrobat it would say John Jacobs Marta Robinson Conrad Simms Marketing Manager Events Manager Site Manager 866 ...

Not the easiest thing to understand!

If we swap the rows and columns, it will actually say John Jacobs Marketing Manager 866... etc.

So rather than trying to do advanced fiddling with the tags, sometimes we just rearrange our content to go with the flow, quite literally.
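Here's a small Python sketch of that row-by-row reading and what swapping rows and columns does to it. The names and job titles come from the example table, but pairing each name with the title in the same position is an assumption, and the phone numbers are left out.

```python
# The example table laid out the awkward way: one row of names, one row of
# job titles (pairing by position is assumed; phone numbers omitted).
table = [
    ["John Jacobs", "Marta Robinson", "Conrad Simms"],
    ["Marketing Manager", "Events Manager", "Site Manager"],
]


def reading_order(rows):
    """Read a table the default way: row by row, left to right."""
    return [cell for row in rows for cell in row]


def swap_rows_and_columns(rows):
    """Turn each column into a row, so each person's details sit together."""
    return [list(column) for column in zip(*rows)]


before = reading_order(table)
after = reading_order(swap_rows_and_columns(table))
print(" ".join(before))
print(" ".join(after))
```

The second line reads each person followed by their own job title, without touching any tags.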

Finally at the very bottom, there's our footer and we have a website link.

Acrobat will convert that automatically into a hyperlink, but it won't style it, and also all the headers and footers, when we convert to PDF, are going to be flagged as Artifacts - so the Accessibility Checker is going to complain there's a hyperlink in an inaccessible part of the page.

Similarly with these email addresses, they'll be converted automatically in Acrobat and Reader, but we need them to be genuine links so they have an identifiable style applied to them, to make them look and behave different to the rest of the document page.

So let's look at a version where we've fixed all of this, and done things properly.

Now this shouldn't look a great deal different - that's the plan - but we've followed all the rules this time.

Our headings use genuine Heading styles so they're going to be recognised as Heading tags in the PDF.

Instead of using blank lines to separate the paragraphs, we've now added a gap at the start of each paragraph in the Normal style.

If I click the Format button and choose Paragraph, that's where you get at the options and I added 8pt before each paragraph.

That gives me the gap without putting extra content in.

The main thing is instead of using a table to flow this text, I've now put in a text box to hold my second article.

I've given it a border so I'm not relying on font color to differentiate between the two articles - that's a Section 508 thing - so the main text now flows past in a single sequence, and I've used the row-by-row order for the table to reflect the way the screenreader is going to read it (so I don't have to start fiddling with the tags).

I've also taken the web link out of the footer where it would have been Artifacted, and put it in the main body of the document so it's now accessible in both senses of the word, but it's still not a web link - it's just text on a page and I need to convert it into a hyperlink and also do these email addresses.

There is an AutoFormat tool in Word, but it's not on the Ribbon.

So, open the button next to the Quick Access Toolbar and choose "More Commands", then in the dropdown list choose "Commands Not in the Ribbon".

Scroll down a little bit and you'll find "AutoFormat..." - grab that and say Add, and click OK.

That now gives you the tool even though it wasn't in the Ribbon.

I can click AutoFormat, and under Options the only thing I want checked is the last box - formatting Internet links.

All the styles I've already done, I don't want those rearranged.

So I'll AutoFormat Now, click OK, and there's all my email addresses converted and the Web address at the top, all in one go.

Cool trick, worth remembering.
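If you're curious what "text that looks like a link" means in practice, here's a rough Python sketch that spots email and web addresses in plain text. The two patterns are deliberately simple assumptions for illustration - they are not what Word's AutoFormat uses, and real-world address matching is messier.

```python
import re

# Deliberately simple, assumed patterns - good enough for a quick scan only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
WEB = re.compile(r"(?:https?://|www\.)[^\s,;]+")


def find_plain_links(text):
    """Return (email addresses, web addresses) found in a piece of text."""
    return EMAIL.findall(text), WEB.findall(text)


sample = "Contact john@example.com or visit www.example.com for details."
emails, web_addresses = find_plain_links(sample)
print(emails, web_addresses)
```

Anything a scan like this finds in your body text is a candidate for converting into a genuine, styled hyperlink.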

Finally, Alt tag for this image: right-click the image and choose "Size", and the Alt text lives on that tab - that's where you put in the description the screenreader will announce when it gets to that image.

Every image needs one or the Accessibility Check will fail.

So we're good to go.

I'll save the file, and under the Acrobat Ribbon under Preferences, the critical thing is to make sure this last box is ticked so we turn on tagging.

If we don't put tagging in, frankly all of this has been a waste of your time and mine; and we also, under bookmarks, can specify which of the bookmark styles are going to be converted - in other words at the moment the Heading 1 elements are going to be converted into a PDF bookmark, nothing else is.

I can turn these off and on at will just by clicking the boxes.

In a large document it's important to add bookmarks so people can find sections easily without having to read the entire document from start to finish.
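The effect of ticking and unticking those boxes can be pictured with a short Python sketch. The headings and the function name are made up; the only idea taken from the dialog is that each heading level can be switched on or off as a bookmark source.

```python
# Hypothetical headings in document order: (heading level, text).
headings = [
    (1, "Introduction"),
    (2, "Background"),
    (1, "Contacts"),
    (3, "Phone list"),
]


def bookmarks(headings, enabled_levels):
    """Keep only the headings whose level is ticked in the bookmark settings."""
    return [text for level, text in headings if level in enabled_levels]


only_h1 = bookmarks(headings, {1})
h1_and_h2 = bookmarks(headings, {1, 2})
print(only_h1)
print(h1_and_h2)
```

With just Heading 1 ticked you get the top-level sections, and ticking Heading 2 as well adds the next level down.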

I'll leave this as it is even though it's just one page, and I can create my PDF either by clicking this button, "Create PDF", in the Ribbon, or from the main Office button I can choose Save As > Adobe PDF.

I do not want to go through and print to PDF because I'll lose all the tags!

So, Save As > Adobe PDF - save it, and view it in Acrobat.

Well it certainly looks the same as it did 5 seconds ago, so that's a good start, and we can select our text and it now flows around as it's supposed to.

If we want to see the tag structure, then from the Navigation SidePane, if we right-click and choose "Tags" we can open the tags and see how they've been assigned; and if we pull down this little Options menu and choose Highlight Content, then as we select a tag it'll get a blue border on the page so we can see the reading order a screenreader or reflow device would use.

Figure > Heading > Heading > Heading > Paragraph > Paragraph > Paragraph (marvellous!) > Table > Paragraph > Second Article, quote at the bottom; and you'll notice there isn't a tag for the text in the footer because it's being treated as an Artifact.

You may wonder how we got that text box to sit where it is in the flow, between the last paragraph and the quote, and the answer is quite simple - whenever you insert a text box in Word it's attached into the flow based on where your cursor was, so when I drew that text box, my cursor was sitting at the end of the word "direction" - so that's where it's been inserted in the flow.

You can move it around but it makes sense to get it right to start with.

Now there's two things we need to do to this file to get it past the standards; firstly under File > Properties (cmd/ctrl D) on the Advanced tab, make sure we've got the language defined - sometimes it comes through, sometimes it doesn't.

Also when we start pressing the TAB button on our keyboard, we select all the links in the file one at a time.

We're doing it in the right order, but Word doesn't assign an order to hyperlinks so at the moment Acrobat is guessing and we need to confirm it's guessing it correctly.

From Pages, select one or all of the pages, right-click and choose Page Properties, and agree to use the Document Structure - i.e. the Tags - when you're deciding the order of the TABs. Click OK, save the file and we're done and dusted.

We can run an Accessibility Check to prove the point - open the Tools Pane, and open Accessibility if it's not already visible.

It'll always pass a Quick Check because that just looks for tags.

If we run a Full Check, making sure we're showing the report so we can see the results, I'll start running Adobe's own check and click Start Checking.

It's passed - what a surprise!

How about Section 508?

Oooh.

Failure!

Well, no actually it hasn't.

If you look at the summary, everything has either passed or was not applicable apart from Part (c).

Part (c) of s508 says whenever you're using color to highlight the meaning of something you have to have an alternative way of telling that content has a special meaning.

For example when dealing with our web links, we're blue but we're also underlined - so we are following the color rules, but the software can't check that because it doesn't understand the meaning behind the text it's looking at.

So under the Full Check we can turn off part(c) and say we've checked that ourselves, and Acrobat will confirm the document passes everything else - so it is completely s508 compliant, we just had to help out to prove the point.
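The check we just did by eye can be written down as a tiny Python sketch. The link records and their "colored" and "underlined" flags are made up for illustration - the point is only that color must never be the sole cue.

```python
# Hypothetical styling information for links in a document.
links = [
    {"text": "www.example.com", "colored": True, "underlined": True},
    {"text": "john@example.com", "colored": True, "underlined": False},
]


def color_only(links):
    """Return the links that rely on color alone to stand out."""
    return [link["text"] for link in links if link["colored"] and not link["underlined"]]


problems = color_only(links)
print(problems)
```

A link that is blue and underlined passes, while one that is only blue gets reported and needs a second visual cue.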

Now this is a long video but not nearly long enough to cover such a huge topic as accessibility, so it'd be nice if there was somewhere you could go where all the resources were gathered together for you, and you've already got the link!

From the Help menu in Acrobat, choose Online Support > Accessibility Resource Center.

That will open Adobe's Accessibility Resource Center with details of how to work accessibly with all their applications, including Acrobat and PDF, and how to deal with advanced topics such as tagging tables so they flow in a different order, etc.

I've been Dave Merchant for AcrobatUsers.com - thanks for listening!



Products covered:

Acrobat X

Related topics:

Accessibility
