
PDF/UA-1: Marking decorative generated content as an artifact?

dotherightthing
I am replacing the standard list bullets with Unicode bullet characters:

ul,
ol,
li {
    list-style: none;
    padding: 0;
    margin: 0;
}

ul li::before {
    content: '\2022';
    display: block;
    float: left;
    width: 30px;
    font-weight: bold;
    text-align: center;
    margin-left: -30px;
}


This looks fine, visually:

Screen Shot 2019-09-27 at 10.55.12 AM.png


But when I check the output in Acrobat Pro, I see that the bullets are treated as real content, and that they appear before the rest of the page content in the content order:

prince1.png


prince2.png


To resolve this, I would like to tag these bullets as artifacts.

https://www.w3.org/WAI/WCAG21/Techniques/pdf/PDF4 wrote:
The purpose of this technique is to show how purely decorative images in PDF documents can be marked so that they can be ignored by Assistive Technology by using the /Artifact tag. This is typically accomplished by using a tool for authoring PDF.

In PDF, artifacts are generally graphics objects or other markings that are not part of the authored content. Examples of artifacts include page header or footer information, lines or other graphics separating sections of the page, or decorative images.

Example 1: Marking a background image as an artifact using Adobe Acrobat 9 Pro's TouchUp Reading Order Tool

...

Example 2: Marking an image as an artifact in a PDF document using an /Artifact tag or property list

...


https://www.pdfa.org/wp-content/uploads/2019/06/TaggedPDFBestPracticeGuideSyntax.pdf wrote:
3.7 Artifacts

The process of laying out and paginating content for display can lead to the introduction of
additional display items (e.g. page numbers on each page or table borders). These items are not
part of what ISO 32000-1 defines as “real content”; they are considered artifacts of layout (see
14.8.2.2, “Real Content and Artifacts” in ISO 32000-1). A requirement for tagged PDF is to clearly
distinguish “real” content from artifacts. PDF/UA also makes it clear that artifacts must be
accessible, but it is less specific about precisely what is required for content marked as Artifact.
Artifact content must be accessible, therefore the basic rules of accessibility (see 3.2
“Fundamentals”) apply, including requirements for reading order and Unicode.


https://www.pdfa.org/wp-content/uploads/2019/06/TaggedPDFBestPracticeGuideSyntax.pdf wrote:
4.1.4.3 Creation

PDF 1.7 implies that the <NonStruct> structure element type or Artifact marker for marked content
sequences can be used to enclose dot leaders. Accordingly, it is recommended to avoid the
<NonStruct> element in this case, and simply mark dot leaders as Artifact instead.


Following https://drafts.csswg.org/css-content-3/#example-954890b1, I tried suppressing these decorative pseudo-elements by setting the alternative text to an empty string, but this caused them to be omitted visually as well:

ul li::before {
    content: $cv-entity-bull / '';
}


Screen Shot 2019-09-27 at 10.58.30 AM.png
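
For reference, here is a sketch of the same rule in plain CSS. It assumes the Sass variable $cv-entity-bull holds the '\2022' bullet escape, since its definition is not shown in this post. In the CSS Generated Content draft linked above, the string after the slash is alternative text for assistive technology, and an empty string marks the content as decorative; it is not meant to change the visual rendering. The first declaration is a fallback that only helps in processors that reject the slash syntax as invalid:

```css
/* Assumption: $cv-entity-bull compiles to '\2022' (BULLET). */
ul li::before {
    /* Fallback, used when the slash syntax below is rejected as invalid. */
    content: '\2022';
    /* Bullet with empty alternative text (CSS Generated Content Level 3 draft). */
    content: '\2022' / '';
}
```

If a processor accepts the second declaration but still drops the bullet visually, as in the screenshot above, the fallback will not take effect.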


I'm going to try a different CSS approach for now, but regardless it would be useful to know if there is a proven technique for labelling items like this as artifacts.

Thanks,
Dan
  1. Screen Shot 2019-09-27 at 10.55.12 AM.png (140.0 kB)
    Visual layout is ok
  2. Screen Shot 2019-09-27 at 10.58.30 AM.png (142.8 kB)
    Bullets are suppressed visually as well
  3. prince1.png (12.9 kB)
    Bullets trump text in content order
  4. prince2.png (122.6 kB)
    Bullets trump text in content order
dotherightthing
Updating this:

1. Removing the 'float' property also removed the bullets from the content order.
2. I refactored the bullet styling to use ::marker:

ul,
ol,
li {
    padding: 0;
    margin: 0;
}

ul {
    margin-left: 7.9mm;
}

ul > li::marker {
    content: '\2022';
    font-weight: bold;
}


This allows me to reliably indent a custom bullet, though I don't quite understand how the space between the bullet and the text is calculated:

Screen Shot 2019-09-27 at 12.26.38 PM.png
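
As far as I know, CSS does not precisely define the gap between an outside marker and the list item text, so it is largely up to the formatter. One hedged way to widen it, assuming the formatter preserves no-break spaces in marker content (untested in Prince here), is to append one to the marker string:

```css
ul > li::marker {
    /* '\2022' is BULLET; '\00a0' is NO-BREAK SPACE, added only to widen the gap. */
    content: '\2022\00a0';
    font-weight: bold;
}
```

This can only increase the gap; it does not explain or reduce the default spacing.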


More importantly, the bullets no longer appear in the content order, and they are correctly tagged as LI > Lbl:

Screen Shot 2019-09-27 at 12.29.14 PM.png


Screen Shot 2019-09-27 at 12.29.41 PM.png


Screen Shot 2019-09-27 at 12.30.34 PM.png
  1. Screen Shot 2019-09-27 at 12.26.38 PM.png (29.4 kB)
    Space between bullet and text
  2. Screen Shot 2019-09-27 at 12.29.14 PM.png (142.0 kB)
    Content order correct
  3. Screen Shot 2019-09-27 at 12.29.41 PM.png (9.1 kB)
    Content order correct
  4. Screen Shot 2019-09-27 at 12.30.34 PM.png (7.7 kB)
    Content tagged correctly
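
Point 1 of this update can be sketched as the original ::before rule with the float removed. Swapping in display: inline-block is an assumption made here to keep the 30px bullet gutter; the post only reports that removing the float fixed the content order, so this variant would still need checking in Acrobat:

```css
ul li::before {
    content: '\2022';
    /* Assumption: inline-block replaces display: block + float: left. */
    display: inline-block;
    width: 30px;
    font-weight: bold;
    text-align: center;
    /* Pull the bullet into the 30px gutter to the left of the text. */
    margin-left: -30px;
}
```

The ::marker approach shown in point 2 remains the one confirmed in this thread to produce LI > Lbl tagging.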

Edited by dotherightthing

wangp
I'm glad you found a solution. Using the ::marker pseudo-element is correct. I would not consider the bullets to be artifacts.
dotherightthing
Thanks for the feedback. I'm surprised that the marker is treated as content, given that it seems to be presentational, with the semantics conveyed by the list tags. I guess the situation could be different if the marker were a telephone icon or something like that.