Reducing reliance on PDF documents online
Some organisations publish large amounts of their online content as PDF documents. But PDF is rarely chosen because it's been assessed as the best format for the content. In this article we discuss ways to reduce reliance on PDF.
PDF is often the format of choice for content publishers because:
- it's faster and easier to publish this way
- no special technical skills are required — if you can create a Word document, you can just as easily convert it to PDF
- content often starts out as a document and no one thinks of publishing it in any other way.
But if PDF is not used appropriately, it can create a range of problems.
Why reduce the use of PDF?
The push to reduce the use of PDF in Australia stems from concerns about accessibility. In 2010, the Australian Human Rights Commission warned against the use of PDF as the sole format for content after receiving many complaints. It said organisations should:
"make the content available in at least one additional format and in a manner that incorporates principles of accessible document design."
Poor document design
In the same year, the Australian Government Information Management Office also warned against relying on PDF documents. They said government agencies should:
"publish an alternative to all PDF documents (preferably in HTML)."
They then funded a study that found poor document design — not PDF technology — caused most accessibility problems.
My client work has shown that poor PDF implementations affect a wide range of users. While testing websites I've seen people struggle with:
- huge decorative images that create a barrier to reaching the real content
- spreads or landscape layouts that require horizontal as well as vertical scrolling
- multi-column layouts that increase the amount of vertical scrolling required, or lead users to overlook content that continues at the start of a new column (some users just read on to the top of the next page)
- long documents with no linked table of contents or bookmarks to allow them to click through to the section of the document they're interested in
- accidental clicking on words that are linked, but don't look like links
- clicking on text that looks like a link but isn't
- page number references in the text that don't match the page numbers in the document footer or in the PDF page menu
- trying to print or copy text from a page with print or copy restrictions applied (note: determined users can get around this, so don't use PDF because you think it stops people copying your content).
And even when a PDF document is well designed, I've seen users frustrated by:
- poor descriptions of what's in the PDF, making them think twice about whether they want to download it
- links not identified as a PDF download
- having to download a PDF document when all they wanted was contained in a single paragraph.
But it's not just PDF that's the problem. Word and RTF can also be poorly designed. They can be published without a table of contents, active links, heading styles or proper list formatting. They can have tabbed rather than styled columns. Neither format has been evaluated for accessibility support and both are likely to have shortcomings, particularly when it comes to forms and tables. Anyone who has ever converted a document with images into RTF will know that the file size increases significantly. And no one wants to download any sort of document when it's a simple page of text that could just as easily have been a web page.
So if you're publishing Word or RTF documents thinking this is an acceptable alternative to PDF — think again.
Poor PDF interaction skills
The Australian Government study also showed that disabled users sometimes lacked the knowledge or skill to interact with PDF documents. Again, I've seen this issue affect all sorts of users. Many people do not know how to:
- open the bookmarks pane to navigate a longer PDF that doesn't have a linked table of contents
- jump to a specific page in the document
- search for text within a document
- shut down the document without also shutting down their browser.
Strategies to reduce PDF
To create a better user experience for all users, we should reduce our reliance on PDF, and on other document formats — unless a document is specifically called for (as a print version of content or as a download for more in-depth reading, for instance).
Two recent approaches
In 2010, a state government department decided to remove all PDF documents from its website. With a significant budget allocated, its central web team managed the project, hiring staff to convert all PDF content to HTML. A quick look at the site now shows that PDF documents are back, though mainly as an alternative format to HTML.
A few weeks ago I heard about a large university taking a different approach. With encouragement from their Deputy Vice Chancellor, organisational units are reviewing their existing PDF documents, removing any that are out of date or unused and republishing the rest as Word documents wherever possible. While they are reducing PDF documents, they are publishing more Word documents, and as I hope I've shown, that does not necessarily mean increased accessibility or a better online content experience.
An alternative strategy
A better goal might be to limit the use of PDF to appropriate content types. Most organisations won't be willing or able to fund a wholesale conversion of PDF to HTML. And there is a case for publishing certain kinds of content as downloadable, printable or shareable documents. But don't just replace one document format with another. There's no evidence that Word or RTF is superior to PDF for accessibility.
Here's an outline of the steps you could take.
Get management support for your strategy. Find a senior manager who will champion the cause, making it a visible priority within your organisation. If there's no real impetus for change, old publishing habits will persist.
Do a sample audit. You need to have an idea of what you're working with. Identify the types and volume of content published as PDF, Word and RTF documents. Check the design and accessibility of these documents. Note common problems. Communicate your findings across the organisation.
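For the volume part of a sample audit, a short script can give you a rough count of documents by format. The sketch below is only one way to do it. It assumes you already have a list of URLs from a crawl or sitemap export; the function name, the extension mapping and the sample URLs are all made up for illustration. It won't tell you anything about design or accessibility, which still need a manual check.

```python
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed mapping of file extensions to the formats named in the audit step.
# Extend or change this to suit your own site.
DOCUMENT_EXTENSIONS = {
    ".pdf": "PDF",
    ".doc": "Word",
    ".docx": "Word",
    ".rtf": "RTF",
}


def count_document_formats(urls):
    """Count how many URLs point to PDF, Word or RTF documents."""
    counts = Counter()
    for url in urls:
        # Look only at the URL path, so query strings and fragments are ignored.
        suffix = PurePosixPath(urlparse(url).path).suffix.lower()
        label = DOCUMENT_EXTENSIONS.get(suffix)
        if label:
            counts[label] += 1
    return counts


# Hypothetical URLs standing in for a real crawl or sitemap export.
sample_urls = [
    "https://example.org/reports/annual-report.pdf",
    "https://example.org/forms/application.DOCX?download=1",
    "https://example.org/policies/privacy.rtf",
    "https://example.org/about/",
    "https://example.org/reports/strategy.pdf#page=3",
]

for label, total in count_document_formats(sample_urls).most_common():
    print(f"{label}: {total}")
```

Counting by extension will miss documents served from URLs without a file extension, so treat the totals as a starting point rather than a complete inventory.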
Consult widely with content owners and publishers to consider the appropriate format for the different types of content they publish. Note areas of agreement and resistance. Consider the resourcing and skills required to publish in another format. Identify training needs.
Decide on format requirements for new content. You'll probably need to balance creating a good user experience with what is achievable given publishers' attitudes, resources and skills.
Work out how to handle existing content. Start with a full audit. Each business unit could do their own. The audit will need to identify documents that can be:
- removed (because they're out of date or unused)
- converted easily to HTML
- converted to HTML, but with more effort (longer or more complex documents, for example)
- kept, but after some document design improvements (tagging a well designed PDF, for example)
- kept as they are.
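If each business unit records its audit in a spreadsheet, the five categories above can be applied consistently with a simple rule set. The following is a minimal sketch, not a recommended standard. The field names, the page limit and the order of the checks are assumptions you would need to agree on with content owners.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    REMOVE = "remove"
    CONVERT_EASILY = "convert easily to HTML"
    CONVERT_WITH_EFFORT = "convert to HTML, with more effort"
    KEEP_AFTER_IMPROVEMENTS = "keep, after design improvements"
    KEEP_AS_IS = "keep as is"


@dataclass
class AuditRecord:
    # All fields are hypothetical audit columns, filled in by a reviewer.
    title: str
    out_of_date: bool
    downloads_last_year: int
    needed_as_document: bool  # e.g. a print version or in-depth reading
    page_count: int
    has_complex_layout: bool  # e.g. forms, large tables, multi-column pages
    needs_design_fixes: bool  # e.g. untagged PDF, no bookmarks


# Assumed cut-off for an "easy" conversion; adjust to your own content.
EASY_PAGE_LIMIT = 5


def triage(record):
    """Assign one of the five audit categories to a document."""
    if record.out_of_date or record.downloads_last_year == 0:
        return Action.REMOVE
    if not record.needed_as_document:
        if record.page_count <= EASY_PAGE_LIMIT and not record.has_complex_layout:
            return Action.CONVERT_EASILY
        return Action.CONVERT_WITH_EFFORT
    if record.needs_design_fixes:
        return Action.KEEP_AFTER_IMPROVEMENTS
    return Action.KEEP_AS_IS


# Hypothetical example: a short, current fact sheet that works as a web page.
fact_sheet = AuditRecord(
    title="Parking fact sheet",
    out_of_date=False,
    downloads_last_year=340,
    needed_as_document=False,
    page_count=2,
    has_complex_layout=False,
    needs_design_fixes=True,
)
print(triage(fact_sheet).value)
```

The checks run in order, so removal is considered first and a document is only kept once someone has confirmed it is genuinely needed as a document.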
On larger sites, you will need to set priorities for dealing with existing content and do the work in stages, over time.
Support and monitor your strategy. Provide training and resources. Identify mentors who can support and advise others. Seek progress reports from business areas. Do regular, random audits. Monitor site analytics. Note any feedback from people who use your site. Be prepared to change tactics in response to any problems that crop up.
Keep the strategy on the agenda. Meet periodically with management. Keep everyone informed of how things are going. Report on success stories. Encourage and reward people's efforts.