Performance of various xpaths

by robertsmme

Hi,
  I am applying a great number of xpaths to a very large document. The results are slooooow.
 
  Virtually all the xpaths are of the form //ns:parent/ns1:child  where child could be an attribute.
 
  I understand the reason for the performance hit is that the above xpath apparently requires a full document scan.
 
  Is this also the case with Saxon's XPath engine?  Would I get better performance using Saxon 9 rather than Jaxen 1.1.1?  Is there a big hit because the underlying document is a JDOM structure?
 
Thanks
 
Martin

-------------------------------------------------------------------------
Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
_______________________________________________
saxon-help mailing list archived at http://saxon.markmail.org/
saxon-help@...
https://lists.sourceforge.net/lists/listinfo/saxon-help 

Re: Performance of various xpaths

by Michael Kay

It's not clear from your question whether the slow performance you are reporting is coming from Saxon or from Jaxen.
 
Saxon has an optimization for XPath expressions starting with //x: the first time such an expression is used for a given element name x, it scans the document to find all the x elements and builds an index, and on subsequent requests this index is reused. However, this only works for native Saxon tree models (Tiny Tree and Linked Tree); it doesn't work for external models like DOM or JDOM (though actually, it would be easy enough to implement).
 
For this and other reasons, Saxon will usually perform quite a bit faster on its native tree implementation than on third-party tree models.
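
To illustrate the idea behind the //x optimization described above, here is a minimal sketch in Java that uses only the JDK's DOM API. It is not Saxon's implementation, and the class name ElementNameIndex and its methods are hypothetical. It scans the document once, remembers every element under its namespace URI and local name, and then answers //x and //parent/child style lookups from that index instead of rescanning.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper: a one-off scan that indexes elements by expanded name. */
final class ElementNameIndex {
    private final Map<String, List<Element>> byName = new HashMap<>();

    ElementNameIndex(Document doc) {
        scan(doc.getDocumentElement());
    }

    // Key is "{namespace-uri}local-name", so prefixes do not matter.
    private static String key(String ns, String local) {
        return "{" + (ns == null ? "" : ns) + "}" + local;
    }

    // Depth-first walk in document order; each element is visited exactly once.
    private void scan(Element e) {
        byName.computeIfAbsent(key(e.getNamespaceURI(), e.getLocalName()),
                k -> new ArrayList<>()).add(e);
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                scan((Element) c);
            }
        }
    }

    /** Counterpart of //x: all elements with the given name, without rescanning. */
    List<Element> descendants(String ns, String local) {
        return byName.getOrDefault(key(ns, local), Collections.emptyList());
    }

    /**
     * Counterpart of //parent/child, answered from the index of parents.
     * Note: the result is not re-sorted into document order if parents nest.
     */
    List<Element> children(String pNs, String pLocal, String cNs, String cLocal) {
        String wanted = key(cNs, cLocal);
        List<Element> result = new ArrayList<>();
        for (Element p : descendants(pNs, pLocal)) {
            for (Node c = p.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c instanceof Element
                        && wanted.equals(key(c.getNamespaceURI(), c.getLocalName()))) {
                    result.add((Element) c);
                }
            }
        }
        return result;
    }

    /** Parses a string with namespace awareness switched on. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }
}
```

The cost of the scan is paid once per document; every later lookup for the same or a different name is a map access plus a walk over the matching parents' children.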
 
If you have predicates in the path expressions, there are potential further gains from the join optimizer in Saxon-SA. However, if you're executing individual path expressions from a Java application, there's a limit to what Saxon can achieve, because it can't see the whole picture at compile time.
 
Michael Kay


Re: Performance of various xpaths

by robertsmme

Michael,
  Thank you for your response.
 
  Do you have any API that works the other way round, i.e. one that, given a node in a document, can tell whether an XPath expression matches that node?
 
Martin


Re: Performance of various xpaths

by Michael Kay

If the XPath expression takes the form of an XSLT pattern, then you can compile it as a pattern and test it against a node. In 9.1 there's a convenient API for this, for the first time: XPathCompiler has a method compilePattern(), which constructs a pseudo-expression whose effective boolean value is true when the context item matches the pattern. 
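
As an illustration of testing a node against a pattern rather than searching the document for it, here is a minimal JDK-only sketch in Java. It does not use Saxon or compilePattern(); the class ParentChildPattern is hypothetical and handles only the fixed shape //parent/child, where the child is an element or an attribute. The point it shows is that the check looks only at the node and its parent, so no document scan is needed.

```java
import java.io.StringReader;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical matcher for the single pattern shape //parent/child or //parent/@child. */
final class ParentChildPattern {
    private final String parentNs;
    private final String parentLocal;
    private final String childNs;
    private final String childLocal;
    private final boolean childIsAttribute;

    /** Pass null as a namespace URI for names that are in no namespace. */
    ParentChildPattern(String parentNs, String parentLocal,
                       String childNs, String childLocal, boolean childIsAttribute) {
        this.parentNs = parentNs;
        this.parentLocal = parentLocal;
        this.childNs = childNs;
        this.childLocal = childLocal;
        this.childIsAttribute = childIsAttribute;
    }

    /** True when the node itself has the child name and its parent has the parent name. */
    boolean matches(Node node) {
        Node parent;
        if (childIsAttribute) {
            if (!(node instanceof Attr)) {
                return false;
            }
            // In DOM an attribute has no parent node; its owner element plays that role.
            parent = ((Attr) node).getOwnerElement();
        } else {
            if (!(node instanceof Element)) {
                return false;
            }
            parent = node.getParentNode();
        }
        return hasName(node, childNs, childLocal)
                && parent instanceof Element
                && hasName(parent, parentNs, parentLocal);
    }

    private static boolean hasName(Node n, String ns, String local) {
        return Objects.equals(n.getNamespaceURI(), ns) && local.equals(n.getLocalName());
    }

    /** Parses a string with namespace awareness switched on. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }
}
```

With many patterns and one pass over the document, each node can be tested against every pattern as it is visited, instead of running one full search per expression.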
 
Michael Kay


Re: Performance of various xpaths

by robertsmme

Michael,
  That looks promising. What do you mean by "the XPath expression takes the form of an XSLT pattern"?  What is an XSLT pattern?
 
  Can you use the xpath //parent/child in your reply please?
 
Martin


Re: Performance of various xpaths

by Michael Kay

An XSLT pattern is anything you can write in the match attribute of xsl:template. It's a subset of XPath defined at
 
 
Simple paths like //parent/child will work fine, but there are many XPath expressions that are not valid patterns, for example "2+2", or "count(//parent/child)", or "following-sibling::x".
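
To make the notion of a pattern concrete, the following Java sketch runs a small stylesheet whose xsl:template uses match="parent/child". As a match pattern this selects the same nodes as //parent/child, because a pattern is tested against each node wherever it sits in the tree. The sketch assumes the XSLT 1.0 processor bundled with the JDK rather than Saxon, and the class name PatternDemo and the sample document are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical demo: a pattern is whatever can go in xsl:template/@match. */
final class PatternDemo {
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        // Fires for every child element whose parent element is named parent.
        + "<xsl:template match='parent/child'>[<xsl:value-of select='.'/>]</xsl:template>"
        // Suppress all other text so only matched nodes show up in the output.
        + "<xsl:template match='text()'/>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML and returns the text output. */
    static String run(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

An expression such as count(//parent/child) could not be used in that match attribute, which is the distinction drawn above between patterns and general XPath expressions.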
 
Michael Kay


Re: Performance of various xpaths

by andrew welch

2008/7/11  <martin.me.roberts@...>:
> Michael,
>   That looks promising, what do you mean by "the XPath expression takes the
> form of an XSLT pattern".  What is an XSLT Pattern?
>
>   Can you use the xpath //parent/child in your reply please?


The thing is... don't you want all sorts of different //parent/child
combinations?  As in:

//foo/bar
//foo/baz
//other/else

etc.

So, if Saxon builds an index for each use of //, unless you use the
very same combination of parent and child you won't see much of a
benefit (or perhaps just the "parent" is indexed, so for xpaths with
same parent you would see the benefit?)

Anyway, in XSLT you could define a key that indexes the parent and child names:

<xsl:key name="parent-and-child" match="*"
use="concat(name(parent::*), '-', name(.))"/>

and then do:

key('parent-and-child', 'ns:parent-ns2:child')

instead of

//ns:parent/ns2:child

to get the nodes you're after. Once the key is built, the access time
should be constant.
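
The key above can be tried out with a short Java sketch. It assumes the XSLT 1.0 processor bundled with the JDK rather than Saxon; the class name KeyDemo, the namespace URIs and the sample document are made up. Two caveats apply: name() returns the prefixed name as written in the source document, so the lookup string must use the same prefixes, and match="*" indexes elements only, not attributes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical demo: look up parent/child pairs through an xsl:key. */
final class KeyDemo {
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        // Index every element under the string "parentName-ownName".
        + "<xsl:key name='parent-and-child' match='*'"
        + " use=\"concat(name(parent::*), '-', name(.))\"/>"
        // Report how many elements the key returns for one parent/child pair.
        + "<xsl:template match='/'>"
        + "<xsl:value-of select=\"count(key('parent-and-child', 'ns:parent-ns2:child'))\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Returns the number of ns2:child elements whose parent is ns:parent. */
    static String countMatches(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

Because the key covers every parent and child combination at once, lookups for //foo/bar, //foo/baz and so on all share the same index.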



--
Andrew Welch
http://andrewjwelch.com
Kernow: http://kernowforsaxon.sf.net/
