[ANN] Searchable Plugin 0.4.1 released

Grails Searchable Plugin 0.4.1 is released!

"""
The Searchable Plugin aims to provide rich search features to Grails applications with minimum effort, and still give you power and flexibility when you need it.

It is built on the fantastic Compass Search Engine Framework and Lucene and has the same license as Grails (Apache 2).
"""

This is a maintenance release that fixes a few bugs - see JIRA for details:
http://jira.codehaus.org/secure/ReleaseNote.jspa?projectId=11450&styleName=Html&version=14142

It bundles a patched version of Compass 1.2.2 specifically for GRAILSPLUGINS-254. I will give the code to the Compass project and hope they will include it in future versions of Compass.

The next version of the plugin will be 0.5 and means upgrading to Compass 2.0, which has some excellent new features and improvements that we can take advantage of.

Compass 2.0 is natively Java 5 only, with a retroweaver version for 1.4 JVMs. I reckon I will do the same thing and provide two versions of the plugin - Java 5 by default and a separate plugin called "searchable-jdk-14".

Thanks to everybody who has said nice things, raised issues, answered mailing list posts and blogged about the plugin :-)

Cheers,
Maurice

Re: [ANN] Searchable Plugin 0.4.1 released

Wicked! Thanks dude, 272 fixes an issue that came up last week, one that I was ignoring and hoping would go away magically and it has! :)

---------------------------------------------------------------------
To unsubscribe from this list, please visit:
http://xircles.codehaus.org/manage_email

Re: [ANN] Searchable Plugin 0.4.1 released

Thanks a bunch, Searchable is the best plugin for Grails!

Re: [ANN] Searchable Plugin 0.4.1 released

great work! thanks

2008/4/17, Seymour Cakes <seymores@...>:
> Thanks a bunch, Searchable is the best plugin for Grails!

--
Love life, love FOX

Re: [ANN] Searchable Plugin 0.4.1 released

A couple of questions I didn't see in the documentation: is writing to the index thread safe? (I'd imagine the answer must be yes) and what strategies do you use for using this in a clustered environment? Put the index on a shared drive?

Thanks
Dustin

On Thu, Apr 17, 2008 at 10:27 AM, Fox Woo <foxwu718@...> wrote:
> great work! thanks

Re: [ANN] Searchable Plugin 0.4.1 released

Take a look at the Compass doc:
http://www.compass-project.org/docs/1.2.2/reference/html/core-connection.html

Maybe for a cluster, the best approach is a JDBC connection.

On 17/04/2008, Dustin Whitney <dustin.whitney@...> wrote:
> A couple of questions I didn't see in the documentation: is writing to the
> index thread safe? (I'd imagine the answer must be yes) and what
> strategies do you use for using this in a clustered environment? Put the
> index on a shared drive?
>
> Thanks
> Dustin

Re: [ANN] Searchable Plugin 0.4.1 released

Hi Maurice.
I'm glad to tell you that the two main bugs I had with previous versions now seem to be fixed! (the NPE when cascade-saving, and some other errors with component references).

The other thing was about termFreq, which I thought had a bug. Maybe it's not a bug after all, but some misunderstanding on my part, or it's not clearly explained in the docs, or it's a bug :-)

Let me explain.

When calling SomeClass.termFreqs('someTerm'), I thought the resulting number would be "the number of occurrences of someTerm within instances of SomeClass".

For example, if I had:

    (new Album(title:'yeah yeah yeah')).save()
    (new Album(title:'Just say yeah')).save()

and then I query: Album.termFreqs('yeah'), I would get a count of 4.
However, what I seem to be getting is "the number of Album instances that have the term 'yeah' in any of their indexable properties".

So... maybe this is the intended behaviour... maybe not... in any case, I think that 1) it should be more explicitly explained, 2) a *real* term frequency, with respect to terms, should be added.
Like for example (completely made up example), if I had a Book class, which hasMany Paragraph, and I'm storing the text in the Paragraph. And I'm doing some text analysis, wanting to know how many times a certain term appears in the Book. I don't want the count of paragraphs that contain that word, I want the actual number of occurrences of that word.

Thinking a little more, maybe this behaviour is a "feature" of Compass?

BarZ
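
For anyone following along, the two readings of "term frequency" in the Album example can be sketched in plain Java. This is only an illustration under stated assumptions: the names TermCounts, totalOccurrences and documentCount are made up here and are not part of the Searchable plugin, Compass or Lucene, and the tokenizer is a crude stand-in for a real analyzer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not plugin API: contrasts the two counting definitions.
public class TermCounts {

    // Lowercase the text and split on anything that is not a letter or digit.
    // A real Lucene analyzer does much more; this is only an assumption for the sketch.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"));
    }

    // "Word count" reading: every occurrence of the term in every text is counted.
    static int totalOccurrences(List<String> texts, String term) {
        int count = 0;
        for (String text : texts) {
            for (String token : tokenize(text)) {
                if (token.equals(term)) {
                    count++;
                }
            }
        }
        return count;
    }

    // "Document count" reading: each text counts once if it contains the term at all.
    static int documentCount(List<String> texts, String term) {
        int count = 0;
        for (String text : texts) {
            if (tokenize(text).contains(term)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("yeah yeah yeah", "Just say yeah");
        System.out.println(totalOccurrences(titles, "yeah")); // prints 4
        System.out.println(documentCount(titles, "yeah"));    // prints 2
    }
}
```

With the two album titles from the post, the first method gives the 4 that was expected and the second gives the 2 that was observed.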

Re: Re: [ANN] Searchable Plugin 0.4.1 released

Hey Barz,

term frequencies (SomeClass.termFreqs) gives you a list of term+frequency pairs, in other words, a list of terms and their respective frequencies.

Personally I think the documentation is pretty clear:

http://grails.org/Searchable+Plugin+-+Searching#SearchablePlugin-Searching-termFreqs

and here's the first example from that section:

    // print all Book term frequencies

Anyway, I think the feature you describe makes sense, but it can be achieved now, if not especially optimised, by simply hunting for the term in the term-freqs, eg:

    Book.termFreqs.find { it.term == 'marmalade' }.freqs

The information exists in the index, so it could be exposed in a simpler fashion, but is it required? Term-freqs are an advanced topic (IMHO) and I wonder how many people will use this feature?

The other point is that the term frequency currently provided is the frequency of a term over the whole index, not just a single Book instance! Again the information is in the index on a per Lucene document (Book instance) basis, it's just a question of exposing it.

As you said these are features that make sense in Compass itself so I think they are questions for the Compass forum.

Cheers,
Maurice

On 21/04/2008, Barzilai Spinak <barcho@...> wrote:
> Hi Maurice.
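
The "hunt for the term in the list" idea can also be sketched in plain Java, with the one caveat that a plain find yields nothing when the term is absent, so the lookup needs a fallback. Everything here is an assumption for illustration: TermFreq is a hypothetical stand-in for one term+frequency pair, not the plugin's own class, and the sample numbers are invented.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not plugin API: look up one term in a list of term+frequency pairs.
public class TermFreqLookup {

    // Made-up stand-in for a single term+frequency pair.
    static final class TermFreq {
        final String term;
        final int freq;

        TermFreq(String term, int freq) {
            this.term = term;
            this.freq = freq;
        }
    }

    // Linear scan over the pairs; returns 0 when the term is not in the list
    // instead of failing on a missing match.
    static int freqOf(List<TermFreq> termFreqs, String term) {
        for (TermFreq tf : termFreqs) {
            if (tf.term.equals(term)) {
                return tf.freq;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Invented sample data, only to exercise the lookup.
        List<TermFreq> freqs = Arrays.asList(
                new TermFreq("marmalade", 3),
                new TermFreq("toast", 7));
        System.out.println(freqOf(freqs, "marmalade")); // prints 3
        System.out.println(freqOf(freqs, "jam"));       // prints 0
    }
}
```

The scan is linear in the number of distinct terms, which matches the "not especially optimised" remark above.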

Re: Re: [ANN] Searchable Plugin 0.4.1 released

It got through the first time :-)

Maurice Nicholson wrote:
> [Reposting this because it hasn't turned up on Nabble 15 hours later -
> problems with Nabble?]
>
> Hey Barz,
>
> term frequencies (SomeClass.termFreqs) gives you a list of
> term+frequency pairs, in other words, a list of terms and their
> respective frequencies.
>
> Personally I think the documentation is pretty clear:
>
> http://grails.org/Searchable+Plugin+-+Searching#SearchablePlugin-Searching-termFreqs

I think we are using a different definition of term frequency. To me it's the number of occurrences of the *term*. However, the termFreqs method is returning the number of *documents* (instances of domain classes, in Grails) where the term occurs, disregarding the occurrences of the term itself.

Let me simplify my previous example. Let's imagine there's a single instance of Paragraph in our DB/index:

    p = new Paragraph(text: "Hello John, my name is John and this is my friend John")
    p.save()

Now, according to my definition, the frequency of the term "John" is 3.
According to Paragraph.termFreqs(), it's 1 (because there's only one domain object where the term John appears, disregarding the fact that it appears 3 times).

Of course, when searching, a Paragraph object/document where the term "John" appears three times will rank higher than a Paragraph where it appears only once. So, of course, this information is stored somewhere in the index. (Last night I spent some time working with Luke which is amazing and fun :-) )

I'm not in immediate need of this feature, I'm just being picky while I learn :-)
I'll dig around Compass and Lucene a little more and see what I can find.

On a completely unrelated, but more important note:
How would you describe the query performance of Compass/Lucene versus searching in the relational database using normal GORM/HSQL?

BarZ

Re: Re: [ANN] Searchable Plugin 0.4.1 released

On 4/22/08 5:05 PM, "Barzilai Spinak" <barcho@...> wrote:

> I think we are using a different definition of term frequency. To me
> it's the number of occurrences of the *term*. However, the termFreqs
> method is returning the number of *documents* (instances of domain
> classes, in Grails) where the term occurs, disregarding the occurrences
> of the term itself.

Your definition is relatively natural, but it goes against common practice in text retrieval. Experiments on retrieval performance have generally borne out the value of the "document count" definition over the "word count" definition that you suggest. This probably has much to do with the average size of the documents under test interacting with the fact that you want to weight terms based on the prevailing frequency without much contribution from documents that are particularly related to the term.

> Of course, when searching, a Paragraph object/document where the term
> "John" appears three times will rank higher than a Paragraph where it
> appears only once. So, of course, this information is stored somewhere
> in the index.

Only indirectly. There is a per term weight vector stored on each document, but the weights don't only depend on the number of occurrences of that term. The details vary depending on how you index the document. Some details are available in the javadoc for Lucene's Similarity function.

> On a completely unrelated, but more important note:
> How would you describe the query performance of Compass/Lucene versus
> searching in the relational database using normal GORM/HSQL?

For what it does, it is vastly faster. If you want semi-structured data, take Lucene. If you want the best few elements of a ranked list (ranked according to a Lucene computable score), choose Lucene. If you want joins, aggregates and referential integrity, pick the RDBMS.
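
To make the point about weights a little more concrete, here is a rough sketch of the two factors as the default Similarity in Lucene 2.x defines them, to the best of my knowledge: tf is the square root of the in-document count, and idf is 1 + ln(numDocs / (docFreq + 1)). The class name ClassicTfIdf is made up, the sample numbers are invented, and the real score also involves length normalisation and boosts that are left out here, so treat this as an approximation and check the Similarity javadoc for your version.

```java
// Hypothetical sketch, not Lucene code: the tf and idf factors as described above.
public class ClassicTfIdf {

    // Term-frequency factor: grows with the square root of the number of
    // occurrences in one document, so 3 occurrences do not weigh 3 times as much as 1.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse-document-frequency factor: driven by the number of documents that
    // contain the term (the "document count"), not by total word occurrences.
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        // One paragraph mentions "John" three times, another mentions it once.
        System.out.printf("tf(3) = %.3f, tf(1) = %.3f%n", tf(3), tf(1));

        // Invented index sizes: a rare term versus a common one in 1000 documents.
        System.out.printf("idf(rare) = %.3f, idf(common) = %.3f%n",
                idf(1, 1000), idf(500, 1000));
    }
}
```

Under these assumptions the paragraph with three occurrences still ranks higher on the tf factor, but the stored weight is not the raw count, and the per-term weighting across the index uses the document count.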