Re: Need Saxon-SA? (was: Re: saxon-help Digest, Vol 26, Issue 18)

Thanks a lot for your help, now the program runs. ;-) I decided at the same time to switch to the s9api which is simpler to use.

However, I have trouble achieving streaming at the source. I created a custom URIResolver which traps calls to doc('anything') and sends a StreamSource taken from somewhere else in my code. I then wrote the following:

    Processor saxp = new Processor(true);
    XQueryCompiler xqp = saxp.newXQueryCompiler();
    XQueryExecutable xqex = xqp.compile(my_query_string);
    XQueryEvaluator xqe = xqex.load();
    xqe.setURIResolver(my_uri_resolver);

From there, I tried two things:

1. Iterate over query results with:

    Iterator it = xqe.iterator();
    while (it.hasNext()) { Do something }

2. Create a custom Destination and Receiver to trap output events, with:

    xqe.run(new MyDestination(new MyReceiver()));

However, when I trace into the code with a query such as "for $x in (#saxon:stream#) {doc('0')/a/b[1]} return $x", I remark the following:

- In situation #1, the whole source is read on the first call to it.hasNext().
- In situation #2, the whole source is read before any event is called on the instance of MyReceiver.

Clearly the query result cannot change once the closing tag of the first "b" has been read; therefore there should be no need to read the rest of the document, let alone to postpone sending the output until the end of the file.

Yet the query "for $x in (#saxon:stream#) {doc('0')/a/b} return $x", which returns all b elements, gives me the desired behavior: it sends them one by one to the output as the input is being read. I don't understand why the processor behaves differently for the two queries.

After some experiments, I discovered though that as soon as the "in" clause includes sibling functions ([1], following-sibling, etc.) or unions of two path expressions, or if the return clause applies further operations on the result (such as $x/b above), the whole document is read prior to outputting the results.

Is there a way to change my code, or did I reach a limit of Saxon's streaming capabilities?

Thanks,
Sylvain

-----Original Message-----
Date: Thu, 17 Jul 2008 08:39:05 +0100
From: "Michael Kay" <mike@...>
Subject: Re: [saxon] Need Saxon-SA? (was: Re: saxon-help Digest, Vol 26, Issue 18)
To: "'Mailing list for the SAXON XSLT and XQuery processor'" <saxon-help@...>
Message-ID: <BCF9FA33CC214C0393FBE8780C111343@Sealion>
Content-Type: text/plain; charset="iso-8859-1"

If you're invoking Saxon from the command line, you need to use the -sa option.

If you're invoking it from a Java API, you need to specify that you want the SA processor when you start:

JAXP: instantiate com.saxonica.SchemaAwareTransformerFactory
native Saxon API: instantiate SchemaAwareConfiguration
s9api: new Processor(true).

Sorry about the inconvenience!

Michael Kay
Saxonica

> > -----Original Message-----
> > From: saxon-help-bounces@... [mailto:saxon-help-bounces@...] On Behalf Of Sylvain Hallé
> > Sent: 17 July 2008 00:18
> > To: saxon-help@...
> > Subject: [saxon] Need Saxon-SA? (was: Re: saxon-help Digest, Vol 26, Issue 18)
> >
> > Thanks Michael. I wrote a URIResolver which passes whatever I want to the engine.
> >
> > I downloaded an evaluation copy of Saxon-SA and a licence key in order to test the (#saxon:stream#) functionality; I deleted my Saxon-B jars, replaced them with those from saxonsa9-1-0-1j.zip and copied the licence file into my classpath and restarted the whole thing to make sure the changes would be noticed. However I get the following when I run my test program:
> >
> > XPST0003: XQuery syntax error in #for $x in (#saxon:stream#) {#:
> > To use saxon:stream, you need the Saxon-SA processor from http://www.saxonica.com/
> >
> > Besides, whether the licence file is present in the classpath or not does not change the message. Shouldn't I be told "License file saxon-license.lic not found" if I remove the licence from the classpath?
> >
> > I must have missed something obvious; does anyone know what that can be?
> >
> > Sylvain
> >
> > --- Original message ---
> >
> > Date: Fri, 11 Jul 2008 18:12:30 +0100
> > From: "Michael Kay" <mike@...>
> > Subject: Re: [saxon] saxon-help Digest, Vol 26, Issue 18
> > To: "'Mailing list for the SAXON XSLT and XQuery processor'" <saxon-help@...>
> > Message-ID: <9AA132C547FF406881B5D680FF587985@Sealion>
> > Content-Type: text/plain; charset="us-ascii"
> >
> > > > Thanks. However, I noticed in the documentation that this streaming facility is available for an expression that must start with doc() (i.e. it must read a file).
> > > > If I set my document context to another source (e.g. a character stream produced by another part of my code) using the bindDocument() method, is there a way to achieve the same result?
> >
> > You can write a URIResolver that intercepts the call on doc() and returns a StreamSource. But I don't think it can be made to work with the XQJ bindDocument() method.
> >
> > Michael Kay
> > http://www.saxonica.com/

-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
saxon-help mailing list archived at http://saxon.markmail.org/
saxon-help@...
https://lists.sourceforge.net/lists/listinfo/saxon-help
|
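For readers following the thread, the kind of URIResolver discussed in the message above (one that intercepts the URI passed to doc() and answers with a StreamSource) might look roughly like the sketch below. It uses only the standard javax.xml.transform interfaces. The class name InMemoryResolver, the register() method, and the idea of keeping documents in a map are assumptions made for illustration; the poster's actual resolver is not shown in the thread.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical resolver: maps the href given to doc() to XML text held in memory. */
class InMemoryResolver implements URIResolver {
    private final Map<String, String> documents = new ConcurrentHashMap<>();

    /** Associates an href such as "0" with the XML text to serve for it (illustrative helper). */
    void register(String href, String xml) {
        documents.put(href, xml);
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        String xml = documents.get(href);
        if (xml == null) {
            throw new TransformerException("No document registered for " + href);
        }
        // A Reader can be consumed only once, so build a fresh StreamSource per call.
        StreamSource source = new StreamSource(new StringReader(xml));
        source.setSystemId(href);
        return source;
    }
}
```

An instance of such a class would be passed where the post calls xqe.setURIResolver(my_uri_resolver). In a real streaming setup the Reader would presumably wrap a live character stream rather than a String held in memory.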
|
Re: Need Saxon-SA? (was: Re: saxon-help Digest, Vol 26, Issue 18)

> However, when I trace into the code with a query such as "for $x in (#saxon:stream#) {doc('0')/a/b[1]} return $x", I remark the following:
>
> - In situation #1, the whole source is read on the first call to it.hasNext().
> - In situation #2, the whole source is read before any event is called on the instance of MyReceiver.
>
> Clearly the query result cannot change once the closing tag of the first "b" has been read; therefore there should be no need to read the rest of the document, let alone to postpone sending the output until the end of the file.

There are many path expressions that could theoretically be streamed, but which Saxon does not in fact stream. For streaming to work, you must use an expression that follows the rules in

http://www.saxonica.com/documentation/sourcedocs/serial/streamability.html

Note in particular that this does not allow positional predicates. You should be able to rewrite the expression to get around this restriction: try

((#saxon:stream#) {doc('0')/a/b})[1]

> Yet the query "for $x in (#saxon:stream#) {doc('0')/a/b} return $x", which returns all b elements, gives me the desired behavior: it sends them one by one to the output as the input is being read. I don't understand why the processor behaves differently for the two queries.
>
> After some experiments, I discovered though that as soon as the "in" clause includes sibling functions ([1], following-sibling, etc.) or unions of two path expressions, or if the return clause applies further operations on the result (such as $x/b above), the whole document is read prior to outputting the results.
> page referenced above.

As to "why", it's simply a question of time and effort for doing the optimizations and testing them. I started with the streaming subset of XPath defined in XML Schema, and then added a few extra capabilities like union paths and simple boolean predicates.

One of the reasons that positional predicates aren't currently supported is that the streaming evaluator currently has to make a decision whether to include a node in the result or not based solely on knowledge of the names of the ancestors of the node and the values of its attributes; it doesn't retain any memory about preceding siblings of any of those nodes.

Also, you shouldn't need to "trace into the code" to see whether streaming is being used. Use the -explain option on the command line, or processor.setConfigurationProperty(FeatureKeys.TRACE_OPTIMIZER_DECISIONS, Boolean.TRUE) from the Java API.

Michael Kay
http://www.saxonica.com/
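
To make the point about ancestor names concrete, here is a minimal sketch of a streaming matcher for the path /a/b, written against the JDK's StAX API. It is not Saxon's implementation and says nothing about Saxon's internals; the class name StreamingPathSketch, the method matchAB, and the limit parameter are all invented for illustration, and the code assumes that b elements contain text only. The match decision looks only at the stack of ancestor names, with no record of preceding siblings. Something like [1] is then applied from outside, by ceasing to pull events once enough matches have arrived, which is loosely analogous to the ((...))[1] rewrite suggested above.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Illustrative only: a pull-based matcher for the fixed path /a/b. */
class StreamingPathSketch {

    /**
     * Collects the text of elements matching /a/b, in document order,
     * and stops pulling parser events after 'limit' matches.
     * Assumes matched b elements have text-only content.
     */
    static List<String> matchAB(String xml, int limit) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        Deque<String> ancestors = new ArrayDeque<>(); // names of currently open elements
        List<String> results = new ArrayList<>();
        try {
            while (reader.hasNext() && results.size() < limit) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    // The decision depends only on ancestor names, never on siblings.
                    boolean match = ancestors.size() == 1
                            && "a".equals(ancestors.peek())
                            && "b".equals(name);
                    if (match) {
                        // getElementText() reads up to and including the matching end tag.
                        results.add(reader.getElementText());
                    } else {
                        ancestors.push(name);
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    ancestors.pop();
                }
            }
        } finally {
            reader.close();
        }
        return results;
    }
}
```

Calling matchAB(xml, 1) returns at most the first /a/b element and asks the parser for nothing further after that element's end tag, whereas a predicate evaluated inside the matcher would need per-parent sibling counters that this design deliberately does not keep.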