[Xpath-ng] An elegant foundation (WAS: Re: Thoughts on work products)

Joe English jenglish at flightlab.com
Fri Nov 22 14:00:10 MST 2002


Jeni Tennison wrote:
>
> Simplicity is *very* subjective :) I guess I really mean "elegant"
> (which is even more objective!) rather than simple. I think that an
> elegant language is one in which there are a few simple fundamental
> rules that combine to make something that's very flexible and
> powerful.

The authors of HaXml [1] came up with a *very* nice foundation
for XPath/XSLT-like processing.  The basic building block is
a 'CFilter', which is simply a function 'f :: XMLNode -> [XMLNode]'
(For those who don't speak Haskell, that's a function that
takes a node and returns a list of nodes).

Two CFilters f1 and f2 can be glued together to form a new CFilter
that evaluates f2, applies f1 to each element of the result
(yielding a list of lists of nodes), then concatenates the
resulting nested list to get a flat list:

    o :: CFilter -> CFilter -> CFilter
    f1 `o` f2  = concat . map f1 . f2

Or, here's another way of writing the same thing:

    (f1 `o` f2) x = [ z | y <- f2 x, z <- f1 y ]


If we allow the node set vs. node list distinction to blur a bit [2],
then all of the XPath axes [3] fit neatly into this framework: they
take a context node and return a new list of nodes.  The XPath "/"
operator corresponds to filter composition (with the arguments
in the opposite order).

Name tests and node tests can also be written as filters:
these return a singleton list containing the context node
if it matches, the empty list if it doesn't.  XPath predicates --
except for those involving 'position()' or 'last()' --
can be handled the same way.

XSLT templates are also functions which take the current node
(plus some other stuff) and produce a list of result nodes.

So with just two building blocks -- filters and filter composition --
you can implement a substantial subset of the functionality of
XPath and XSLT.  Navigation (XPath axes), selection (node tests,
name tests, and [most] predicates), and construction (templates)
are all expressible as functions from nodes to node lists,
and they compose nicely.


> There's something about looking at the problem at a deep
> level, going right down to its foundations and building up from there,
> rather than taking a very narrow view and fiddling with only parts of
> the design at a time.

It's worth checking out what the Haskell community has been
doing; Uwe Schmidt's Haskell XML toolbox [4] is probably
the current state of the art.


[1] HaXml: <URL: http://www.cs.york.ac.uk/fp/HaXml/ >

[2] The set vs. list thing can be treated rigorously, but I'll
    skip that for now.

[3] HaXml actually only supports the "downward" axes, but
    HXML [5], my own variation on this theme, supports all of
    them except "namespace::*".

[4] Toolbox: <URL: http://www.fh-wedel.de/~si/HXmlToolbox/ >

[5] HXML:  <URL: http://www.flightlab.com/~joe/hxml >


--Joe English

  jenglish at flightlab.com



More information about the Xpath-ng mailing list