Preventing <text> elements in inline SVG elements from generating tagged text?

David J Prokopetz
Here's a weird one. We have a document with the following inline SVG:

<svg height="13pt" width="36.5pt" style="margin: -2pt 0;">
    <title>Death</title>
    <text x="1" y="15" fill="white" stroke="black" stroke-width="2">DEATH</text>
    <text x="1" y="15" fill="white">DEATH</text>
</svg>


(The doubled text is to work around the fact that many user agents don't support an outside stroke effect – it basically covers up the inner part of the stroke, leaving only the outside half.)

When the document containing this SVG is converted to tagged PDF, it results in the following tag hierarchy:

<Figure Alt="Death">
+--DEATH DEATH


Consequently, when screen readers hit this, they end up saying "death death death", when the intended audible output is simply "death".

In this case, the alternate text attribute produced by the inline SVG's <title> element should suffice; is there any way to suppress the production of tagged text by the accompanying <text> elements?
mikeday
Bug report: computer screams "DEATH! DEATH DEATH!" when file is opened :D
David J Prokopetz
Hah! I don't think I'd call it a bug, per se – there are absolutely situations where you would want an inline SVG's <text> elements to produce tagged text. It's one of those cases where both possible outcomes are situationally reasonable; it just happens that the one I'm getting isn't the one that's needed in this particular scenario.

mikeday
This turns out to be a bit tricky, and a common issue with Acrobat text-to-speech using the "read out loud" feature:

https://community.adobe.com/t5/acrobat/looking-for-ideas-force-reader-to-read-alt-text-not-eps-content/m-p/9327023

Are you using Acrobat for testing? Apparently "real" screen readers are not supposed to have this issue, as the tagging of the structure tree is reasonable and would be encountered very commonly with a chart or graph or word art figure.
David J Prokopetz
I was able to reproduce the issue with NVDA as well. I haven't tried other screen readers on this particular document yet.
wangp
I've tested with NVDA 2020.2 and Acrobat Pro DC on Windows. On the following file it reads:
"This is the first line."
"Graphic This is the figure title. This is the figure description."
"This is the last line."

When the PDF is opened in web browsers:
  • Firefox: reads out the letters inside the figure instead of the alt text
  • Chrome: "heading level 2 death 1"
  • Edge: figure skipped over entirely
Attachment: t4464.pdf (10.5 kB)
David J Prokopetz
Hm.

When I try the sample file you've posted in Acrobat Pro DC, it reads "This is the figure title. This is the figure description.", but then *also* reads the contents of the text elements inside the figure. That's consistent with my results in the initial post (i.e., reading the "Death" alt text, then reading the two text elements inside the SVG, for a total of three repetitions).

When I try it using NVDA 2020.2, it doesn't read the text elements inside the figure, but it does say "This is the figure title. This is the figure description." twice. That actually illuminates what's going on with my own document: if I give the figure totally unrelated alt text as a test, it reads the alt text twice, but doesn't read the contained text at all. I'd mistaken it repeating the alt text twice for reading the text contained within the figure, since the text contained within the figure repeats the same word twice.

Hm. So I guess the question *now* is what about the resulting tag structure is causing NVDA to repeat the alt text twice!

mikeday
This sounds like a question best put to the NVDA developers, as at the moment we are second-guessing their intentions. There might be changes we could make to the PDF structure tree, but it's difficult to know what they actually expect, given that the existing structure seems reasonable enough.
David J Prokopetz
Definitely, and I'll be taking that up with them.

However, that still leaves the originally reported issue that the text contained within the <Figure> tag does not reflect the figure's visual intent. The inline SVG uses two <text> elements with identical contents stacked on top of each other to achieve a particular visual effect, so to a sighted user, the text appears only once. This doesn't come across in the tagged text, which treats each <text> element separately and thus repeats the text twice. That's something an assessor could decide to cite as a compliance failure when evaluating the document's PDF/UA-1 compliance, and they wouldn't be wrong to do so.

I guess the ideal solution in this specific case would be to somehow have the ability to suppress the production of tagged text from *exactly one* of the two <text> elements. Maybe I can work around it for now by converting one <text> element to vector paths and leaving the other as is.
David J Prokopetz
In any event, I was able to work around it with some really ugly scripting that extracts the text from the inline SVG, converts it into a span containing that text, then sets the SVG data it just removed as a background image of the newly created span. Prince doesn't currently tag <text> elements in SVGs used as backgrounds, so the resulting tag structure is simply:

<Span>
+--DEATH


This is probably literally the only scenario this would work in, but if it works, it works!
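
For anyone wanting to try something similar, the transformation described above can be roughly sketched in Python as a preprocessing step on the markup. This is not the script actually used (that wasn't posted), only an illustration under assumptions: the helper name svg_to_background_span, the base64 data URI, and the transparent inline-block styling that keeps the span text from painting over the background are all hypothetical choices, and how Prince renders and tags the result is untested here.

```python
import base64
import html
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialize the SVG namespace as the default namespace (no "ns0:" prefixes).
ET.register_namespace("", SVG_NS)


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def svg_to_background_span(svg_markup: str) -> str:
    """Hypothetical helper: wrap an inline SVG's text in a <span> and
    move the SVG itself into the span's background image."""
    root = ET.fromstring(svg_markup)

    # Use the first <text> element only, so the word appears exactly once.
    label = None
    for el in root.iter():
        if local_name(el.tag) == "text":
            label = "".join(el.itertext()).strip()
            break
    if not label:
        raise ValueError("SVG contains no <text> content")

    # A standalone SVG image needs the namespace that inline SVG may omit.
    if not root.tag.startswith("{"):
        root.set("xmlns", SVG_NS)

    width = root.get("width", "auto")
    height = root.get("height", "auto")
    svg_bytes = ET.tostring(root, encoding="utf-8")
    payload = base64.b64encode(svg_bytes).decode("ascii")

    # Assumed styling: size the span like the SVG and hide its own glyphs,
    # leaving the background image as the only visible rendering.
    style = (
        f"display:inline-block;width:{width};height:{height};"
        "overflow:hidden;color:transparent;"
        f"background:url('data:image/svg+xml;base64,{payload}') no-repeat"
    )
    return f'<span style="{style}">{html.escape(label)}</span>'
```

Passing the SVG from the first post to svg_to_background_span returns a single span whose only text content is DEATH, with both original <text> elements preserved inside the background image data.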
