
XInclude and XPointer

Gismya
Hello!

I'm a total XML and Prince beginner, and this is probably really simple, but after searching furiously for a while I still can't work out how to use XPointer with XInclude to include only part of an XML file.

Basically, there are two use cases I'm after: taking only the part between the body tags of XHTML2 files, and extracting individual chapters from those files. The files are structured like this one: https://lagen.nu/sfs/parsed/2010/900.xht2


<?xml version="1.0" encoding="utf-8"?>
<html xmlns:rinfo="http://rinfo.lagrummet.se/taxo/2007/09/rinfo/pub#"
      xmlns:xi="http://www.w3.org/2001/XInclude">
    <head>
      <title property="dc:title">Lagtext</title>
    </head>
    <body>
      <section id="L2010:900" property="dct:alternate" content="PBL" class="kapitelindelning">
        <!-- Plan- och bygglag (2010:900), 6, 8-11, 13 kap; -->
        <xi:include href="900.xht2"/>
      </section>
    </body>
</html>


That's the whole of the document I'm currently testing with.
<xi:include href="900.xht2"/>
works on its own, but I can't figure out how to add the XPointer part. If this is as simple a task as I think it is, you are allowed to mock me mercilessly after helping me.

Cheers!
mikeday
Have you tried adding an xpointer="..." attribute on the xi:include?
Gismya
Yes,

<xi:include href="900.xht2" xpointer="K2"/>
or
<xi:include href="900.xht2" xpointer="html"/>

Both give this error message:
could not load file:///[fileurl], and no fallback was found
mikeday
So it works if the xpointer attribute is not there, but cannot load the file if it is?
Gismya
Exactly!
mikeday
Try xpointer="xpointer(html)". I think it needs the xpointer() to specify the scheme.
Gismya
xpointer="xpointer(html)" works, but using any other tag (for example xpointer="xpointer(body)") gives the same error as before, and none of

xpointer="xpointer(k2)"
xpointer="xpointer(#k2)"
xpointer="xpointer(id="k2")"

or anything that would point to an id works.
mikeday
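Okay, this is a bit fiddly. To select the body element, you will need to use xpointer(html/body), or xpointer(//body). The expression has to be a valid XPath evaluated from the root of the document, not a partial XPath pattern as in XSLT. That is why xpointer(html) matches (html is the root element) while a bare xpointer(body) matches nothing.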
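
A minimal sketch of those path forms, reusing a cut-down version of the host document from the first post. Only one xpointer value should be active at a time, so the alternatives are shown in comments. The last one, with /*, is an untested assumption: as an XPath it selects the child elements of body rather than body itself, which may be closer to "only the part between the body tags".

```xml
<?xml version="1.0" encoding="utf-8"?>
<html xmlns:xi="http://www.w3.org/2001/XInclude">
  <head>
    <title>Lagtext</title>
  </head>
  <body>
    <section class="kapitelindelning">
      <!-- Includes the body element of 900.xht2, tags and all -->
      <xi:include href="900.xht2" xpointer="xpointer(html/body)"/>

      <!-- Alternative: match body wherever it sits in the included file
           xpointer="xpointer(//body)" -->

      <!-- Untested assumption: child elements of body only, without the body wrapper
           xpointer="xpointer(html/body/*)" -->
    </section>
  </body>
</html>
```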

To select an element by ID, you can use xpointer="foo" or xpointer="element(foo)". However, these will only work if the XML parser recognises the id attribute as being an ID. This will only happen if a suitable DTD has been loaded, or the xml:id attribute is used. This is a bit of a hassle.
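
As a rough sketch of the xml:id route: it assumes you are able to edit the included file, and that the chapter you want is the one with the id K2 (the value tried earlier in this thread; the real file may use a different id). The two fragments below belong to two different files.

```xml
<!-- In 900.xht2 (hypothetical edit): xml:id is treated as an ID without any DTD -->
<section xml:id="K2">
  <!-- chapter content -->
</section>

<!-- In the including document: shorthand pointer -->
<xi:include href="900.xht2" xpointer="K2"/>

<!-- In the including document: the same thing with the element() scheme -->
<xi:include href="900.xht2" xpointer="element(K2)"/>
```

The xi prefix has to be bound to http://www.w3.org/2001/XInclude in the including document, as in the first post.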

Another option is to explicitly match the id attribute, like this: xpointer="xpointer(//*[@id='foo'])". This should definitely work, regardless of whether or not there is a DTD.
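
Applied to the file in this thread, that might look like the line below. K2 is again an assumed id value; the comparison is case-sensitive, so it has to match the id attribute in 900.xht2 exactly.

```xml
<!-- Plain attribute match: no DTD and no xml:id needed in the included file -->
<xi:include href="900.xht2" xpointer="xpointer(//*[@id='K2'])"/>
```

The inner value uses single quotes because the attribute itself is delimited by double quotes; nesting double quotes, as in xpointer="xpointer(id="k2")", makes the XML itself malformed.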
Gismya
Thank you so much! I finally got it! I don't feel as stupid as I thought I would after getting a solution, since it was a bit tricky. Thank you immensely for your help!
mikeday
Glad that it works! It is actually quite tricky, as the XML parser does not give good error messages for XPointers that it does not understand.
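
The error text quoted earlier mentions a missing fallback. XInclude defines an xi:fallback child element for this, so one thing that may make failures easier to spot is a visible placeholder like the sketch below. Whether the fallback is actually used when the XPointer (rather than the file) fails to resolve is an assumption here and has not been confirmed in this thread. The id K2 and the message text are illustrative only.

```xml
<xi:include href="900.xht2" xpointer="xpointer(//*[@id='K2'])">
  <!-- Used in place of the include if it cannot be resolved -->
  <xi:fallback>
    <p>Could not include K2 from 900.xht2.</p>
  </xi:fallback>
</xi:include>
```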
LeifHalvardSilli
While we are waiting for the XML processor (Libxml2, right?) to recognize the id attribute of HTML5 documents as having "ID-ness", this was a great tip from Mike. All the other solutions I can think of (namely, to use xml:id or to add an ID-ness declaration in the DOCTYPE) require you to do stuff to the included documents. And this stuff would also render each included document invalid, from e.g. the HTML5 NU validator’s point of view.