[Xpath-ng] FIXPath (XPath NG core) straw man draft posted

Uche Ogbuji uche.ogbuji at fourthought.com
Mon Jan 6 09:56:10 MST 2003


> 
> Uche Ogbuji wrote:
> 
> > [Joe English]
> > > That might be desirable.  The result of any multi-step
> > > location path will be a list in document order with no
> > > duplicates, which is compatible with XPath 1 node sets.
> >
> > You mean that is how XPath 1.0 node sets are often interpreted, right?
> 
> Right.
> 
> XPath 1.0 expressions "officially" return node sets,
> but in almost all use cases the result is processed
> sequentially (i.e., as a node list).
> 
> > To go back to my example.  Document:
> >
> > <a>
> >     <b>
> >         <c/>     <--- context node
> >     </b>
> >     <d/>
> >     <e/>
> > </a>
> >
> > In XPath 1.0,
> >
> > (ancestor::*/*)[1]
> >
> > Would return
> >
> > {c}
> 
> Is that right?   Since "(ancestor::*/*)" is parenthesized,
> the predicate "[1]" is evaluated "with respect to the child axis",
> i.e., in document order (see [XPATH] 3.3, first NOTE.)
> So it should return {a}, not {c}.

Frankly, that section of the XPath spec makes no sense, and has never really 
made any sense.  The statement "The Predicate filters the node-set with 
respect to the child axis" flat out contradicts the note that follows, which 
states that in the most common case, the axis that applies is *not* the child 
axis.

That having been said, if I squint at that section with enough leeway for 
logical inconsistency, it does come out closer to your position than to mine.

Even if that particular example is so resolved, you haven't yet convinced me 
that there are *no* examples where the merge behavior I proposed won't cause 
non-trivial incompatabilities.

I may be pursuing too much caution in this area, but contemplation of XPath 
1.0 axis arcana always does that to me  :-)

That is certainly one good reason it would be nice to eliminate such arcana.  
But I don't know whether we can do so without breaking too much.


> > > IOW, instead of merging the result lists, simply concatenate them.
> > And remove dupes.
> >
> > > I like this approach better, but it's a bigger change from XPath 1.
> > Hmm.  Or do you mean *don't* remove dupes?  I do think that would be too big
> > a change.
> 
> Right, I mean *don't* remove duplicates.  At least that's the
> approach I would favor if XPath 1.0 compatibility weren't an issue.
> Depending on how _important_ that issue is, sorting and removing
> duplicates might still be preferable.

I think so.

> Something else to consider:  although sorting in document order
> can be done fairly cheaply for static documents, (O(N log N) worst
> case, O(N+M) for merging), it's more expensive on mutable documents.
> AFAIK, the best algorithm for comparing the relative order of
> two nodes requires you to walk up and across the tree.  Unless
> someone has come up with an efficient way to keep cached ordering
> information up-to-date when modifying the document, that is --
> I'm not aware of any such algorithm.

This can be done with a relatively simple algorithm that keeps gaps in the 
document order indices for each node, filling these in as needed, and then 
triggering a recomputation up the tree (a re-balancing) when needed.  There 
are *many* approaches to this algorithm, and a variant of it is what we're 
using in development branches of 4Suite.

This is relatively efficient, adding a tiny bit of overhead to all mutation, 
and the occasional larger bit of overhead to some operations.  Actual sorting 
of arbitrary vectors of nodes is just a simple sort-list-by-index.


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com
A Python & XML Companion - http://www.xml.com/pub/a/2002/12/11/py-xml.html
XML class warfare - http://www.adtmag.com/article.asp?id=6965
MusicBrainz  metadata - http://www-106.ibm.com/developerworks/xml/library/x-thi
nk14.html





More information about the Xpath-ng mailing list