Some simple examples of using Erlang’s XPath implementation

(This page is a mirrored copy of an article originally posted on the LShift blog; see the archive index here.)

We’ve been investigating the possibility of an XPath-based routing extension to RabbitMQ, in which XPath expressions would be used as binding patterns and the message structure would be exposed as an XML infoset. As part of this work, we’ve been looking at the XPath implementation in xmerl, the XML library that ships with Erlang.

Here are a couple of examples of Erlang’s XPath in action. First, let’s parse a document to be queried:

{ParsedDocumentRootElement, _RemainingText = ""} =
  xmerl_scan:string("<foo>" ++
                      "<myelement myattribute=\"red\">x</myelement>" ++
                      "<myelement myattribute=\"blue\">x</myelement>" ++
                      "<myelement myattribute=\"blue\">y</myelement>" ++
                    "</foo>").

(We could have used xmerl_scan:file to read from an external file, instead of xmerl_scan:string, if we’d wanted to.)
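As a rough sketch of the file-based variant, the wrapper below calls xmerl_scan:file/1 and turns its result into a tagged tuple. The module name xpath_file_demo and the function scan_file/1 are made up for illustration; only xmerl_scan:file/1 comes from xmerl. Malformed XML still makes the scanner exit rather than return an error tuple, and this sketch does not try to catch that.

```erlang
-module(xpath_file_demo).
-export([scan_file/1]).

%% Parse the XML file at Path and return {ok, RootElement},
%% or {error, Reason} if the file cannot be read.
%% (Module and function names are hypothetical.)
scan_file(Path) ->
    case xmerl_scan:file(Path) of
        %% Match the error tuple first: the second clause would
        %% otherwise match it as well.
        {error, Reason} ->
            {error, Reason};
        {RootElement, _RemainingText} ->
            {ok, RootElement}
    end.
```

The root element returned in the ok tuple can be passed to xmerl_xpath:string/2 in the same way as ParsedDocumentRootElement.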

Next, let’s retrieve the contents of every myelement node that contains text exactly matching “x”:

69> xmerl_xpath:string("//myelement[. = 'x']/text()",
            ParsedDocumentRootElement).
[#xmlText{parents = [{myelement,1},{foo,1}],
          pos = 1,
          language = [],
          value = "x",
          type = text},
 #xmlText{parents = [{myelement,2},{foo,1}],
          pos = 1,
          language = [],
          value = "x",
          type = text}]

Notice that the query returned two XML text nodes, and that their parents fields differ, corresponding to the different paths through the source document to the matching nodes.
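Often only the text itself is wanted rather than the whole record. A minimal sketch, assuming the record definitions from xmerl's xmerl.hrl header are available via include_lib; the module name xpath_text_demo and the function text_values/2 are hypothetical:

```erlang
-module(xpath_text_demo).
-export([text_values/2]).

%% Record definitions (#xmlText{} and friends) that ship with xmerl.
-include_lib("xmerl/include/xmerl.hrl").

%% Run XPath against Doc and return the value field of every
%% #xmlText{} node in the result. Anything in the result that is
%% not a text node is skipped by the pattern in the generator.
text_values(XPath, Doc) ->
    [Value || #xmlText{value = Value} <- xmerl_xpath:string(XPath, Doc)].
```

Given the result shown above, calling xpath_text_demo:text_values("//myelement[. = 'x']/text()", ParsedDocumentRootElement) should produce a list of two "x" strings.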

Next, let’s search for all myelements whose myattribute is exactly equal to the string “red”:

72> xmerl_xpath:string("//myelement[@myattribute='red']",
            ParsedDocumentRootElement).
[#xmlElement{
     name = myelement,
     expanded_name = myelement,
     nsinfo = [],
     namespace = #xmlNamespace{default = [],nodes = []},
     parents = [{foo,1}],
     pos = 1,
     attributes = 
         [#xmlAttribute{
              name = myattribute,
              expanded_name = [],
              nsinfo = [],
              namespace = [],
              parents = [],
              pos = 1,
              language = [],
              value = "red",
              normalized = false}],
     content = 
         [#xmlText{
              parents = [{myelement,1},{foo,1}],
              pos = 1,
              language = [],
              value = "x",
              type = text}],
     language = [],
     xmlbase = "/localhome/tonyg",
     elementdef = undeclared}]

This time, there’s only the one match. Finally, a query that no nodes satisfy:

75> xmerl_xpath:string("//myelement[@myattribute='red' and . = 'y']",
            ParsedDocumentRootElement).
[]

If we had replaced the 'y' with 'x', we’d have retrieved a non-empty nodeset.
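For routing, the interesting question is simply whether a binding pattern selects anything at all. Here is a minimal sketch of such a predicate. The module xpath_match_demo and the function matches/2 are our own illustrative names, not part of xmerl or RabbitMQ, and the sketch assumes the expression evaluates to a node-set (a list) rather than a scalar.

```erlang
-module(xpath_match_demo).
-export([matches/2]).

%% Return true if XPath selects at least one node in Doc and
%% false if the resulting node-set is empty.
%% Assumption: the expression yields a node-set; expressions that
%% evaluate to a scalar are not handled and raise a case_clause error.
matches(XPath, Doc) ->
    case xmerl_xpath:string(XPath, Doc) of
        [] -> false;
        [_ | _] -> true
    end.
```

With the document parsed earlier, matches("//myelement[@myattribute='red' and . = 'y']", ParsedDocumentRootElement) is false, while the same pattern with 'x' in place of 'y' is true.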

Comments

On 1 February, 2008 at 11:22 am, John Watson wrote:

Tony,

What do you think are the chances of getting an XPath routing exchange added to a future version of the AMQP spec? This sort of thing is precisely what I need.

On 1 February, 2008 at 4:47 pm, tonyg wrote:

We should experiment with adding it to RabbitMQ first, and making sure the design is sound - and once we have a solid, implementation-neutral proposal, we can ask the AMQP working group what they think of the idea. A proven, implemented idea has a much greater chance of being accepted.

On 18 April, 2008 at 1:14 pm, Sopwith Camel wrote:

Why not expose the message as XDM and avoid the cost and additional transformation of going via XML Infoset?

(apologies for being required to use a nom de blog)

On 18 April, 2008 at 4:54 pm, tonyg wrote:

@Sopwith Camel: Good point. XDM is a richer model than Infoset. I’m not sure the cost would be significantly different, but there’s certainly an expressivity win. On the other hand, I don’t particularly fancy implementing XPath 2.0 :-)