How can one go about finding "2 element(s) with alternate text but no page content" in a 600 page complex pdf file? Thank you!
However, if you step back and consider the implications of the information, a path presents itself.
.
First, Alternate Text description is typically associated with the <Figure> element.
N.B.: in accordance with ISO 32000, the Alt key can be used in any structure element's dictionary, not just <Figure>.
.
Second, as the report identifies a "Figure" having Alt Text but no content, we know that Figure elements (tags) with alternate text are present in the structure tree.
Third, we know these Figure elements have no association with underlying PDF page content (structure elements and PDF page content are separate).
Fourth, as a consequence, we know that there will be Figure elements in the structure tree that have no nested Container icons (the "Bankers' box").
Fifth, these Figure elements are the culprits.
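The reasoning above can be sketched in code. This is only an illustration, not a tool that reads a PDF: it assumes the structure tree has already been loaded into plain Python dicts (a hypothetical in-memory model), where the keys S, Alt and K mirror the structure element dictionary keys, and an integer kid stands in for a link to marked content on the page. The function names and the sample tree are made up for the example.

```python
def has_page_content(elem: dict) -> bool:
    """True if the element or any descendant points at marked page content."""
    for kid in elem.get("K", []):
        if isinstance(kid, int):  # integer kid = marked-content ID (simplified model)
            return True
        if isinstance(kid, dict) and has_page_content(kid):
            return True
    return False


def find_empty_alt_elements(elem: dict, path: tuple = ()):
    """Yield (tag path, element) for elements with Alt text but no page content."""
    here = path + (elem.get("S", "?"),)
    if "Alt" in elem and not has_page_content(elem):
        yield "/".join(here), elem
    for kid in elem.get("K", []):
        if isinstance(kid, dict):  # descend into nested structure elements only
            yield from find_empty_alt_elements(kid, here)


# Hypothetical sample tree: two Figure tags carry Alt text but no content.
tree = {"S": "Document", "K": [
    {"S": "P", "K": [0]},
    {"S": "Figure", "Alt": "Company logo", "K": [1]},
    {"S": "Figure", "Alt": "Org chart", "K": []},
    {"S": "Sect", "K": [{"S": "Figure", "Alt": "Signature"}]},
]}

for where, elem in find_empty_alt_elements(tree):
    print(where, "->", elem["Alt"])
```

With the sample tree, the two content-less Figure tags ("Org chart" and "Signature") are reported along with their position in the tree. The check is written against any element with Alt, not just Figure, since Alt is permitted on any structure element.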
.
A pathway to locating the culprits:
When the Tags panel is open —
In some cases, exporting the Tagged PDF to Text (Accessible) can help tie the Alt Text to a given page.
The text string(s) that comprise the Alternate Text will be present.
In the text file, a square box denotes the end of content that was exported from the PDF file.
You can compare, page by page, what is in the text file to what textual content is in the PDF page.
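For a 600 page file, the comparison can be narrowed by searching the exported text for the Alt Text strings first. A minimal sketch, assuming you already know (or can guess) the Alt Text strings in question and have read the exported file into a string; the function name and the sample text are hypothetical.

```python
def locate_alt_text(exported_text: str, alt_strings: list[str]) -> dict[str, list[int]]:
    """Map each Alt Text string to the 1-based line numbers where it appears."""
    hits: dict[str, list[int]] = {alt: [] for alt in alt_strings}
    for lineno, line in enumerate(exported_text.splitlines(), start=1):
        for alt in alt_strings:
            if alt.lower() in line.lower():  # case-insensitive match
                hits[alt].append(lineno)
    return hits


# Hypothetical stand-in for the contents of the exported text file.
sample = "Annual Report\nCompany logo\nRevenue grew this year.\nOrg chart\n"
print(locate_alt_text(sample, ["Org chart", "Signature"]))
```

The lines surrounding each hit in the text file then tell you which PDF page to inspect in the Tags panel.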
.
Be well...