Offset Lucene Query

by Jeff Busby-2

I'm trying to implement pagination for the project I'm working on, using
results from Zend_Search_Lucene.  I've discovered the setResultSetLimit
method, which works nicely for limiting the results per page, but I have
yet to find a similar method for setting the offset of the query.  I've
searched through a number of threads on the mailing list as well as the
API documentation, but it looks like this module is still a bit of a work
in progress.  Is such a feature implemented somewhere? Is it in the
works? Is it even possible?  I would appreciate a pointer in the right
direction from anyone.

Cheers,

Jeff.

RE: Offset Lucene Query

by Alexander Veremyev

Hi Jeff,

Sorry for the delay in answering; it was a busy time at Zend/PHP Conf :)

Result set limiting retrieves the "first N" results, not the "best N".

So the leading parts of two result sets with different limits may
differ (because each limited set is sorted by score separately).
For example:
---------------------
Limit is 5:
2 6 0 3 1

Limit is 10:
7 11 2 6 15 0 3 16 44 1
---------------------

So an offset can't be applied in this model.
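
To make the example above concrete, here is a small simulation in plain PHP. It does not call Zend_Search_Lucene. It assumes, for illustration only, that a limited search collects the first N matching documents in index order and only then sorts them by score. The function name limitedSearch and the scores are hypothetical, chosen so that the output reproduces the two lists above.

```php
<?php
// Hypothetical scores keyed by document ID, listed in index order.
// The values are invented so the output matches the example above.
$scores = array(
    0 => 0.5, 1 => 0.1, 2 => 0.8, 3 => 0.4, 6 => 0.7,
    7 => 1.0, 11 => 0.9, 15 => 0.6, 16 => 0.3, 44 => 0.2,
);

// Simulates "take the first N, then sort by score". This is an assumption
// about the model described above, not the library's actual code.
function limitedSearch(array $scores, $limit)
{
    // Take the first $limit matches in index order, keeping the IDs as keys.
    $firstN = array_slice($scores, 0, $limit, true);
    // Sort that subset by score, highest first, preserving the keys.
    arsort($firstN);
    return array_keys($firstN);
}

echo implode(' ', limitedSearch($scores, 5)), "\n";   // 2 6 0 3 1
echo implode(' ', limitedSearch($scores, 10)), "\n";  // 7 11 2 6 15 0 3 16 44 1
```

Under that assumption, the first five entries of the limit-10 list (7 11 2 6 15) are not the limit-5 list (2 6 0 3 1), which is why "limit 10, skip 5" would not give a consistent second page.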

Nevertheless, you can implement pagination using the full result (or one
limited by some large enough number).
Important note: the returned hits array doesn't actually contain document
data. Documents are loaded at access time (and that takes time).
So you can collect the IDs from the result set, store them somewhere (e.g.
using Zend_Cache), and then retrieve documents from the index when
necessary:
-----------------
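// Run the search once and keep only the lightweight data.
$hits = $index->find($query);
$result = array();
foreach ($hits as $hit) {
    // Each entry is array(document id, score).
    $result[] = array($hit->id, $hit->score);
}

// store $result somewhere
....
-----------
...
// load $result
...
// $start and $end delimit the current page; $end must not exceed count($result).
for ($pos = $start; $pos < $end; $pos++) {
    $doc = $index->getDocument($result[$pos][0]);
    $score = $result[$pos][1];  // index 1 holds the score, index 0 the id
    ...
    $title = $doc->title;
    ...
}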
-------------
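
The loop above needs a $start/$end window for each page. One way to derive it from a page number is sketched below in plain PHP, with the index and cache calls left out. The name pageSlice is hypothetical, $result stands for the stored list of array(id, score) pairs, and the sample data is made up.

```php
<?php
// Returns the entries of $result that belong on page $page (1-based).
// $result is the stored list of array(id, score) pairs from the search.
function pageSlice(array $result, $page, $perPage)
{
    $start = max(0, ($page - 1) * $perPage);
    // array_slice() stops at the end of the array, so a short last page
    // or an out-of-range page needs no extra bounds check.
    return array_slice($result, $start, $perPage);
}

// Made-up stand-in for the stored result of $index->find($query).
$result = array(
    array(7, 1.0), array(11, 0.9), array(2, 0.8), array(6, 0.7), array(15, 0.6),
);

foreach (pageSlice($result, 2, 2) as $entry) {
    list($id, $score) = $entry;
    // In the real application: $doc = $index->getDocument($id);
    echo "id=$id score=$score\n";
}
```

Only the documents on the requested page are then loaded from the index, which keeps the expensive getDocument() calls proportional to the page size rather than to the size of the full result.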

With best regards,
   Alexander Veremyev.

-----Original Message-----
From: Jeff Busby [mailto:jeff@...]
Sent: Wednesday, October 10, 2007 12:24 PM
To: fw-formats@...
Subject: [fw-formats] Offset Lucene Query


Re: Offset Lucene Query

by Jeff Busby-2

Thanks for the info, Alex; that's pretty much what I ended up doing.  If
I get some time, I'll post exactly what I did for others to use and critique.

Cheers,

Jeff Busby.


RE: Offset Lucene Query

by Ahmed Shaikh Memon

Hi Alex,

If we put indexes into Zend_Cache, it will be on a per-user basis. Wouldn't this method be disk-space-intensive?

Is there any alternative to this?

regards,

-Ahmed.
