Ibex now supports the PDF Universal Accessibility standard (PDF/UA) and WCAG.
For information on the standard, see PDF/UA.
Enabling PDF/UA Creation
To create a PDF/UA compliant PDF the FO file needs to have three things:
(1) a declaration of the ibex namespace on the <fo:root> element like this:
<fo:root
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:ibex="http://www.xmlpdf.com/2003/ibex/Format"
...
>
(2) a language declaration, which can be done on the <fo:root> element like this:
<fo:root
xml:lang="en-US"
...
>
(3) the FO file needs to include metadata surrounded by <ibex:pdfua> tags like this:
<ibex:pdfua>
<x:xmpmeta xmlns:x="adobe:ns:meta/"
x:xmptk="Adobe XMP Core 5.6-c01591.163280, 2018/06/22-11:31:03">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="en">PDF/UA Document</rdf:li>
</rdf:Alt>
</dc:title>
<pdfuaid:part>1</pdfuaid:part>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
</ibex:pdfua>
This metadata includes the title "PDF/UA Document". Change that to your own document title.
Once the three items above are included in the FO file, Ibex will produce a PDF/UA compliant file.
A complete test file with one paragraph looks like this:
<?xml version="1.0" encoding="utf-8"?>
<fo:root
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:ibex="http://www.xmlpdf.com/2003/ibex/Format"
xml:lang="en-US"
>
<ibex:pdfua>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.6-c01591.163280, 2018/06/22-11:31:03">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">PDF/UA Document</rdf:li>
<rdf:li xml:lang="en">PDF/UA Document</rdf:li>
</rdf:Alt>
</dc:title>
<pdfuaid:part>1</pdfuaid:part>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
</ibex:pdfua>
<fo:layout-master-set>
<fo:simple-page-master master-name="page" margin="1.5cm" page-height="297mm" page-width="310mm">
<fo:region-body column-count="1" region-name="body" margin="2.75cm 0.5cm 1cm 3cm" />
</fo:simple-page-master>
</fo:layout-master-set>
<fo:bookmark-tree>
<fo:bookmark internal-destination="header1" starting-state="show">
<fo:bookmark-title>Heading One</fo:bookmark-title>
</fo:bookmark>
</fo:bookmark-tree>
<fo:page-sequence master-reference="page" initial-page-number="1" format="i" font="12pt arial">
<fo:flow font="11pt arial" flow-name="body">
<fo:block font-size="larger" role="H1" id="header1">
Main heading
</fo:block>
<fo:block>
Hello world
</fo:block>
</fo:flow>
</fo:page-sequence>
</fo:root>
The file created from the FO can be validated using the free PAC program. This tests various aspects of compliance and shows the results:
The tagged PDF tree structure can be viewed:
Note that the contents of the fo:page-sequence have been placed in a "Part" structure element. This is optional and is controlled by the Settings.PDFUA_PutPageSequenceAreasInPartElements flag.
Headers
As shown in the above example, any fo:block can have the "role" property set. To create a heading, use one of the standards-compliant heading roles H1 to H6, like this:
<fo:block font-size="larger" role="H1" id="header1">
Main heading
</fo:block>
To disable the use of the "role" property when creating structure elements, specify the 'ignore-role-attributes' property on the <ibex:pdfua> element like this:
<ibex:pdfua ignore-role-attributes="true">
Tables
Table elements are automatically tagged according to the following table:
Element | Tag |
---|---|
fo:table | Table |
fo:table-caption | Caption |
fo:table-header | THead |
fo:table-body | TBody |
fo:table-footer | TFoot |
fo:table-row | TR |
fo:table-cell | TD or TH |
Table cells inside a table header are tagged as TH. In addition:
- cells in table headers are given an "ID" property to identify them
- cells in the table body automatically have a "Headers" property which identifies the relevant header cell(s). There may be more than one if the cell spans multiple columns
- where a header has multiple rows, the cells in the lower rows have "Headers" properties which reference the cells in higher rows that cover the same columns
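As a minimal sketch of the mapping above, the following fragment shows a two-column table with one header row. The cell text and column widths are made up for illustration, and the comments give the tags one would expect from the table above rather than verified Ibex output.

```xml
<!-- Illustrative fragment only; comments show the expected structure tags. -->
<fo:table xmlns:fo="http://www.w3.org/1999/XSL/Format"
          table-layout="fixed" width="100%">                 <!-- Table -->
  <fo:table-column column-width="50%"/>
  <fo:table-column column-width="50%"/>
  <fo:table-header>                                          <!-- THead -->
    <fo:table-row>                                           <!-- TR -->
      <!-- Header cells: expected to be tagged TH and given an "ID" -->
      <fo:table-cell><fo:block>Product</fo:block></fo:table-cell>
      <fo:table-cell><fo:block>Price</fo:block></fo:table-cell>
    </fo:table-row>
  </fo:table-header>
  <fo:table-body>                                            <!-- TBody -->
    <fo:table-row>                                           <!-- TR -->
      <!-- Body cells: expected to be tagged TD, each with a "Headers"
           property referring to the header cell above it -->
      <fo:table-cell><fo:block>Widget</fo:block></fo:table-cell>
      <fo:table-cell><fo:block>9.99</fo:block></fo:table-cell>
    </fo:table-row>
  </fo:table-body>
</fo:table>
```

No Ibex-specific attributes are needed here, since the tagging described above is automatic.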
In practice this looks like the element tree shown below, where a TH element has "ID", "Role" and "Rowspan" properties:
And a TD cell element in the table body has a "Headers" property which matches the "id" property of the cell above it in the header:
Lists
List elements are automatically tagged according to the following table:
Element | Tag |
---|---|
fo:list-block | L |
fo:list-item | LI |
fo:list-item-label | Lbl |
fo:list-item-body | Lbody |
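For illustration, a minimal list fragment is sketched below. The item text and indent values are placeholders, and the comments give the tags one would expect from the table above.

```xml
<!-- Illustrative fragment only; comments show the expected structure tags. -->
<fo:list-block xmlns:fo="http://www.w3.org/1999/XSL/Format"
               provisional-distance-between-starts="1cm"
               provisional-label-separation="0.25cm">        <!-- L -->
  <fo:list-item>                                             <!-- LI -->
    <fo:list-item-label end-indent="label-end()">            <!-- Lbl -->
      <fo:block>1.</fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">          <!-- Lbody -->
      <fo:block>First item</fo:block>
    </fo:list-item-body>
  </fo:list-item>
  <fo:list-item>                                             <!-- LI -->
    <fo:list-item-label end-indent="label-end()">            <!-- Lbl -->
      <fo:block>2.</fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">          <!-- Lbody -->
      <fo:block>Second item</fo:block>
    </fo:list-item-body>
  </fo:list-item>
</fo:list-block>
```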
Static Content
The contents of fo:static-content elements are marked as "Artifact".
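The sketch below shows where static content typically sits in an FO file. The region name "header", the page dimensions and the text are assumptions chosen for this example, and the ibex namespace and <ibex:pdfua> metadata described earlier are omitted for brevity.

```xml
<!-- Illustrative only: a page header defined as static content. -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" xml:lang="en-US">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page" margin="1.5cm"
                           page-height="297mm" page-width="210mm">
      <!-- margin-top leaves room for the region-before area -->
      <fo:region-body region-name="body" margin-top="2cm"/>
      <fo:region-before region-name="header" extent="1.5cm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <!-- Repeated on every page; expected to be marked as "Artifact" -->
    <fo:static-content flow-name="header">
      <fo:block text-align="center">Annual Report</fo:block>
    </fo:static-content>
    <!-- Flow content is tagged as part of the document structure -->
    <fo:flow flow-name="body">
      <fo:block>Body text of the document.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```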
Image Alt Tags
You can specify an Alt tag to describe images with text using the "ibex:alt" property like this:
<fo:external-graphic src="RedbrushAlpha-0.25.png" ibex:alt="picture of tree"/>
WCAG Requirements
PDF/UA documents created by Ibex support the Web Content Accessibility Guidelines 2.2 standard.
The PAC program supports viewing WCAG compliance using the WCAG tab on the main screen:
Each of the WCAG 2.2 requirements which relate to PDF files is explained below. The descriptions of the requirements come from www.w3.org.
PDF1: Applying text alternatives to images with the Alt entry in PDF documents
This requirement is satisfied. Ibex supports Alt entries for images using the ibex:alt property on each <fo:external-graphic> element like this:
<fo:external-graphic src="RedbrushAlpha-0.25.png" ibex:alt="picture of tree"/>
Support for the same thing on the <fo:instream-foreign-object> element will be added soon.
PDF2: Creating bookmarks in PDF documents
This requirement is satisfied. Ibex supports creating bookmarks using the <fo:bookmark-tree> and <fo:bookmark> elements like this:
<fo:bookmark-tree>
<fo:bookmark internal-destination="CONTENTS67104136">
<fo:bookmark-title>Содержание</fo:bookmark-title>
</fo:bookmark>
...
This creates a bookmark tree in the PDF file like this:
PDF3: Ensuring correct tab and reading order in PDF documents
This requirement is satisfied. The order of text in the PDF follows the order used in the input formatting objects document.
PDF4: Hiding decorative images with the Artifact tag in PDF documents
This requirement is satisfied. The Artifact tag is applied to page headers and footers (specifically the contents of <fo:static-content> elements) and other elements such as table cell borders and backgrounds.
Elements marked with the Artifact tag can be viewed on the Artifacts tab of the Logical Structure view in the PAC program:
PDF5: Indicating required form controls in PDF forms
This requirement is not applicable. Ibex does not support creating PDF forms because they are not part of the XSL Formatting Objects specification.
PDF6: Using table elements for table markup in PDF Documents
This requirement is satisfied. Table elements are automatically tagged according to the following table:
Element | Tag |
---|---|
fo:table | Table |
fo:table-caption | Caption |
fo:table-header | THead |
fo:table-body | TBody |
fo:table-footer | TFoot |
fo:table-row | TR |
fo:table-cell | TD or TH |
PDF7: Performing OCR on a scanned PDF document to provide actual text
This requirement is satisfied. Ibex generates actual text rather than images of text.
PDF8: Providing definitions for abbreviations via an E entry for a structure element
This requirement is satisfied. The definition of an abbreviation can be specified using the ibex:abbrev property like this:
<fo:block>
<fo:inline ibex:abbrev="National Aeronautics and Space Administration">NASA</fo:inline>
goes to moon
</fo:block>
This can be viewed using the PAC program like so:
PDF9: Providing headings by marking content with heading tags in PDF documents
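This requirement is satisfied. Headings are created by setting the "role" property of a fo:block to one of H1 to H6, as described in the Headers section above.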
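As a small sketch, a document outline with one top-level heading and two second-level headings could be marked up as follows using the "role" property on fo:block. The heading text and font sizes are placeholders, and the outer fo:block is only there to carry the namespace declaration for this fragment.

```xml
<!-- Illustrative fragment only: heading levels set with the role property. -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:block role="H1" font-size="18pt">User Guide</fo:block>
  <fo:block role="H2" font-size="14pt">Installation</fo:block>
  <fo:block>Run the installer and follow the prompts.</fo:block>
  <fo:block role="H2" font-size="14pt">Configuration</fo:block>
  <fo:block>Edit the settings file to suit your environment.</fo:block>
</fo:block>
```

Keeping the levels in sequence (an H2 under an H1, an H3 under an H2) is generally what accessibility checkers look for in a heading structure.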
PDF10: Providing labels for interactive form controls in PDF documents
This requirement is not applicable. Ibex does not support creating PDF forms because they are not part of the XSL Formatting Objects specification.
PDF11: Providing links and link text using the Link annotation and the /Link structure element in PDF documents
This requirement is satisfied. A link annotation is created automatically when using a <fo:basic-link> element:
<fo:block text-align="justify" text-align-last="justify" space-after="3pt" >
<fo:basic-link internal-destination="id8">1. Introduction
...
This can be viewed using the PAC program like so:
PDF12: Providing name, role, value information for form fields in PDF documents
This requirement is not applicable. Ibex does not support creating PDF forms because they are not part of the XSL Formatting Objects specification.
PDF13: Providing replacement text using the /Alt entry for links in PDF documents
This requirement is satisfied. Alternative text can be provided using the ibex:alt property of the <fo:basic-link> element:
<fo:block text-align="justify" text-align-last="justify" space-after="3pt" >
<fo:basic-link internal-destination="id8" ibex:alt="basiclink">1. Introduction
...
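Because the fragment above is truncated, a complete and well-formed version might look like the sketch below. The alternative text and the target block are assumptions added for illustration, and the outer fo:block only carries the namespace declarations for this fragment.

```xml
<!-- Illustrative fragment only: an internal link with alternative text. -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          xmlns:ibex="http://www.xmlpdf.com/2003/ibex/Format">
  <!-- Table-of-contents style entry linking to the block with id="id8" -->
  <fo:block text-align="justify" text-align-last="justify" space-after="3pt">
    <fo:basic-link internal-destination="id8"
                   ibex:alt="Go to the Introduction section">1. Introduction</fo:basic-link>
  </fo:block>
  <!-- Link target, normally located elsewhere in the flow -->
  <fo:block id="id8" role="H1">1. Introduction</fo:block>
</fo:block>
```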
This can be viewed using the PAC program like so:
PDF14: Providing running headers and footers in PDF documents
This requirement is satisfied using the <fo:table-header> and <fo:table-footer> elements.
PDF15: Providing submit buttons with the submit-form action in PDF forms
This requirement is not applicable. Ibex does not support creating PDF forms because they are not part of the XSL Formatting Objects specification.
PDF16: Setting the default language using the /Lang entry in the document catalog of a PDF document
This requirement is satisfied using the xml:lang attribute on the <fo:root> element as shown here:
<fo:root
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:ibex="http://www.xmlpdf.com/2003/ibex/Format"
xml:lang="en-US"
>
PDF17: Specifying consistent page numbering for PDF documents
This requirement is satisfied using:
- the <fo:page-number> element for numbering pages
- the <fo:page-number-citation> element for showing the page on which some other content appears
- the <fo:folio-prefix> element which specifies a prefix for page numbers, allowing you to have a page number such as "G-1" like this:
<fo:page-sequence id="glossary" initial-page-number="1">
<fo:folio-prefix>
<fo:inline>G-</fo:inline>
</fo:folio-prefix>
<fo:flow>...</fo:flow>
</fo:page-sequence>
- the <fo:folio-suffix> element which specifies a suffix for page numbers
- the format attribute on the <fo:page-sequence> element which allows formatting of page numbers as letters or roman numerals like this:
<fo:page-sequence master-reference="page" initial-page-number="1" format="i">
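The first two items in the list above can be sketched together as follows. The region names "footer" and "body", the master name "page" and the id "glossary-start" are assumptions made for this example, and the page master is expected to define a matching region-after.

```xml
<!-- Illustrative fragment only: page numbers and a page-number citation. -->
<fo:page-sequence xmlns:fo="http://www.w3.org/1999/XSL/Format"
                  master-reference="page" initial-page-number="1" format="1">
  <!-- Footer repeated on every page, showing the current page number -->
  <fo:static-content flow-name="footer">
    <fo:block text-align="center">Page <fo:page-number/></fo:block>
  </fo:static-content>
  <fo:flow flow-name="body">
    <!-- Cites the page on which the block with id="glossary-start" appears -->
    <fo:block>
      The glossary starts on page <fo:page-number-citation ref-id="glossary-start"/>.
    </fo:block>
    <fo:block id="glossary-start" break-before="page">Glossary</fo:block>
  </fo:flow>
</fo:page-sequence>
```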
PDF18: Specifying the document title using the Title entry in the document information dictionary of a PDF document
This requirement is satisfied.
The XML from which a PDF/UA document is created contains an entry like this:
<ibex:pdfua>
<x:xmpmeta xmlns:x="adobe:ns:meta/"
x:xmptk="Adobe XMP Core 5.6-c01591.163280, 2018/06/22-11:31:03">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">PDF/UA Document</rdf:li>
<rdf:li xml:lang="en">PDF/UA Document</rdf:li>
</rdf:Alt>
</dc:title>
<pdfuaid:part>1</pdfuaid:part>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
</ibex:pdfua>
This entry is necessary for Ibex to identify the document as a PDF/UA document.
The <dc:title> element contains one or more document titles.
This XML is copied to the created PDF.
Adobe Acrobat interprets this XML and displays the title from the <dc:title> element.
Other PDF viewers do not do this, so in addition to adding the <dc:title> element to the PDF file, Ibex copies the title to the catalog document properties.
PDF19: Specifying the language for a passage or phrase with the Lang entry in PDF documents
This requirement is satisfied. Block and inline elements can have their language specified using the xml:lang property like this:
<fo:block xml:lang="en-GB">
this block is "en-GB"
<fo:inline xml:lang="de-DE">but this sentence is "de-DE"</fo:inline>
this block is "en-GB"
</fo:block>
Where there are multiple languages used in a single paragraph this creates span elements in the document structure:
PDF20: Using Adobe Acrobat Pro's Table Editor to repair mistagged tables
This requirement is satisfied. The table elements and tags are shown here using the PAC program:
PDF21: Using List tags for lists in PDF documents
This requirement is satisfied. List elements are automatically tagged according to the following table:
Element | Tag |
---|---|
fo:list-block | L |
fo:list-item | LI |
fo:list-item-label | Lbl |
fo:list-item-body | Lbody |
PDF22: Indicating when user input falls outside the required format or values in PDF forms
This requirement is not applicable. Ibex does not support creating PDF forms because they are not part of the XSL Formatting Objects specification.
PDF23: Providing interactive form controls in PDF documents
This requirement is not applicable. Ibex does not support creating PDF forms because they are not part of the XSL Formatting Objects specification.
Feedback
Please send any feedback to support@xmlpdf.com.