Hi Ralf,
On 10/10/2008 at 10:57 AM, Kraus, Ralf | pixelhouse GmbH wrote:
> I am trying to solve the typical german "Donaudampfschiff"-
> problem by using the DictionaryCompoundWordTokenFilter ...
> Anyone can show me how to configure my schema.xml to use the
> DictionaryCompoundWordTokenFilterFactory ???
Minimally, add the following inside the <analyzer> section for your field type:
<filter class="solr.DictionaryCompoundWordTokenFilterFactory"
dictionary="/path/to/your/dictionary" />
You can also add the following (optional) attributes:
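- "minWordSize" (default: 5): tokens shorter than this are not decompounded
- "minSubwordSize" (default: 2): subwords shorter than this are not emitted
- "maxSubwordSize" (default: 15): subwords longer than this are not emitted
- "onlyLongestMatch" (default: true): add only the longest matching subword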
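As a rough sketch of how this might look in context, here is a complete field type with the optional attributes spelled out at their default values. The field type name "text_de" and the tokenizer are placeholders I made up for illustration, and I'm assuming the factory reads the dictionary location from an attribute named "dictionary"; adjust both to match your own schema:

  <fieldType name="text_de" class="solr.TextField">
    <analyzer>
      <!-- hypothetical tokenizer choice; keep whatever you already use -->
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <!-- optional attributes shown at their defaults -->
      <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
              dictionary="/path/to/your/dictionary"
              minWordSize="5" minSubwordSize="2"
              maxSubwordSize="15" onlyLongestMatch="true"/>
    </analyzer>
  </fieldType>

As I understand it, the filter keeps the original compound token and adds any dictionary subwords it finds, so the dictionary file should list the component words you want to match, one per line.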
FYI, the compound package summary in the nightly trunk Lucene contrib javadocs has some useful information:
<http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/contrib-analyzers/org/apache/lucene/analysis/compound/package-summary.html>
Steve