Keywords list in PDF Metadata is not atomized

Johann
Given the metadata example from https://www.princexml.com/doc/prince-output/#pdf-metadata:

<html>
    <head>
...
        <meta name="keywords" content="cabbage, cooking, eating"/>
    </head>


This should define three keywords in the PDF document - "cabbage", "cooking" and "eating".

But Acrobat Pro just shows a single keyword "cabbage, cooking, eating"!

This is easiest to see in Acrobat's Additional Metadata / Advanced dialog. The data structures shown there are:

...
  <pdf:Keywords>"cabbage, cooking, eating"</pdf:Keywords>
...
 <dc:subject>
    <rdf:Bag>
       <rdf:li>cabbage, cooking, eating</rdf:li>
    </rdf:Bag>
 </dc:subject>
...


But they should be:

...
    <pdf:Keywords>cabbage, cooking, eating</pdf:Keywords>
...
 <dc:subject>
    <rdf:Bag>
       <rdf:li>cabbage</rdf:li>
       <rdf:li>cooking</rdf:li>
       <rdf:li>eating</rdf:li>
    </rdf:Bag>
 </dc:subject>
...
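
To make the expected atomization concrete, here is a minimal Python sketch that splits the `content` value on commas and emits one `rdf:li` per keyword. The helper names `split_keywords` and `build_dc_subject` are hypothetical and only illustrate the structure I expect; this is not how Prince builds its XMP.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def split_keywords(content):
    """Split a comma-separated keywords string into trimmed, non-empty items."""
    return [item.strip() for item in content.split(",") if item.strip()]


def build_dc_subject(keywords):
    """Build a dc:subject element holding an rdf:Bag with one rdf:li per keyword."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    subject = ET.Element(f"{{{DC_NS}}}subject")
    bag = ET.SubElement(subject, f"{{{RDF_NS}}}Bag")
    for keyword in keywords:
        li = ET.SubElement(bag, f"{{{RDF_NS}}}li")
        li.text = keyword
    return subject


keywords = split_keywords("cabbage, cooking, eating")
subject = build_dc_subject(keywords)
print(ET.tostring(subject, encoding="unicode"))
```

The plain `pdf:Keywords` value stays a single comma-separated string; only `dc:subject` is split into separate items.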

- - -
Johann

mikeday
We have changed the XMP serialisation of these values in the latest build; hopefully this helps!
Johann
I tried with build 20220930, but the XMP data did not change!?

- - -
Johann

mikeday
Can you check the attached document which has been generated with --pdf-profile PDF/UA-1? It appears to have the correct XMP.
  1. xmp.html (0.1 kB)
  2. xmp.pdf (2.4 kB)
Johann
Yes, the attached PDF is correct!

         <pdf:Keywords>cabbage, cooking, eating</pdf:Keywords>
         <pdf:Producer>Prince 20220930 (www.princexml.com)</pdf:Producer>


If you convert the attached HTML without any options then the output is still not correct:

         <pdf:Keywords>"cabbage, cooking, eating"</pdf:Keywords>
         <pdf:Producer>Prince 20220930 (www.princexml.com)</pdf:Producer>


Why should we require special options?

- - -
Johann

mikeday
Prince doesn't embed a copy of the metadata as XMP unless the PDF profile requires it. In the default metadata dictionary, however, the /Keywords field is just a single string, and it looks like Acrobat quotes the entire thing instead of treating it as a list.
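
A rough way to check whether a given PDF carries an embedded XMP packet at all is to scan the raw bytes for the usual XMP markers. This is only a heuristic sketch: the function name `has_xmp_packet` is hypothetical, and it assumes the metadata stream is stored unfiltered, so it would miss a packet inside a compressed stream.

```python
def has_xmp_packet(pdf_bytes: bytes) -> bool:
    """Heuristically report whether the raw PDF bytes contain an XMP packet.

    Assumption: the XMP metadata stream is stored without a compression
    filter, so its markers appear literally in the file.
    """
    return b"<x:xmpmeta" in pdf_bytes or b"<?xpacket begin" in pdf_bytes


# Example usage against a file on disk (the file name is only an example):
# with open("xmp.pdf", "rb") as handle:
#     print(has_xmp_packet(handle.read()))
```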
Johann
OK, that's reasonable. We will move to PDF/UA in the future anyway.

- - -
Johann

mikeday
In the latest build we have added a new option, --pdf-xmp-metadata, that forces the inclusion of metadata in XMP format even if the current PDF profile does not explicitly require it.
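
As a sketch, the option could be passed from a Python script like this. The helpers `prince_command` and `run_prince` are hypothetical, and the sketch assumes that the `prince` executable is on the PATH and that the profile can be given in the `--pdf-profile=NAME` form.

```python
import subprocess


def prince_command(source, output, force_xmp=True, profile=None):
    """Build the argument list for a Prince run (nothing is executed here)."""
    command = ["prince", source, "-o", output]
    if force_xmp:
        # Option named in this thread: embed XMP even if the profile does not require it.
        command.append("--pdf-xmp-metadata")
    if profile:
        command.append(f"--pdf-profile={profile}")
    return command


def run_prince(source, output, **options):
    """Run Prince and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(prince_command(source, output, **options), check=True)


# Example: run_prince("xmp.html", "xmp.pdf")
```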