[TYPO3-Solr] Stemming and config

Rik Willems rik at metmeer.nl
Wed Jun 12 13:41:21 CEST 2013


On 12-06-13 11:58, Jigal van Hemert wrote:
> Hi,
>
> On 12-6-2013 8:54, Rik Willems wrote:
>> In typo3cores/conf/dutch/dutch-common-nouns.txt I have the following:
>> reiskosten
>> reiskostenaftrek
>> reiskostenforfait
>> reiskostenregeling
>> reiskostenvergoeding
>
> The whole words should already be tokenized by the
> StandardSolrTokenizer. Normally I would expect the real sub-words in
> your list:
> reis
> kosten
> reiskosten
> aftrek
> forfait
> regeling
> vergoeding
>
>> <!-- split subwords dutch nouns -->
>> <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
>> dictionary="dutch/dutch-common-nouns.txt"
>> minWordSize="5" minSubwordSize="4" maxSubwordSize="15"
>> onlyLongestMatch="true"/>
>
> onlyLongestMatch would match with "reiskostenvergoeding" (which is in
> the dictionary) and none of the subwords would be included in the index
> (as far as I understood this filter factory).
>


Hi Jigal,

None of my changes or attempts has had any effect on the search results.
So far I have used the standard TYPO3 Solr schema.xml and added these
changes to it. Is that the correct place to do this?

Should I restart Tomcat after changing schema.xml? Restarting made no
difference, by the way, but it is good to know.
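
For reference, a minimal sketch of where the filter quoted above would
normally sit in schema.xml. The field type name "textDutchCompound" is
hypothetical and not taken from the standard TYPO3 Solr schema. The
surrounding tokenizer and lowercase filter are assumptions about a
typical analyzer chain, and onlyLongestMatch is set to "false" here only
to try out the suggestion about sub-words. As far as I know, index-time
analyzer changes only affect documents indexed after the core has been
reloaded, so existing content would probably need to be reindexed before
search results change.

```xml
<!-- Hypothetical field type; the name is an assumption for illustration. -->
<fieldType name="textDutchCompound" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- The dictionary lists sub-words (reis, kosten, aftrek, ...), one per line. -->
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="dutch/dutch-common-nouns.txt"
            minWordSize="5" minSubwordSize="4" maxSubwordSize="15"
            onlyLongestMatch="false"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The filter only takes effect for fields whose type uses this analyzer,
so the fields that are actually searched would have to reference it.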

Cheers!
