[TYPO3-Solr] Find word parts

Jigal van Hemert jigal at xs4all.nl
Fri Oct 21 19:57:10 CEST 2011


Hi,

On 21-10-2011 18:42, Rik Willems wrote:
>> simply search for table*
>> This works with Solr 3.x
>
> This works perfectly. But I'd like the visitor not to have to do or see
> this. I want this behaviour by default. Is that possible without the
> 'complicated stuff' in the rest of this thread?

The 'complicated stuff' in the rest of this thread is about a feature 
which I think is rather necessary for languages such as German and 
Dutch: it makes it possible to index the parts of compound words. Compare:
Dutch: rechtsbijstandverzekeringsmaatschappijen
German: Rechtsschutzversicherungsgesellschaften
English: legal protection insurance companies

With support for compound words, a search for verzekering / Versicherung 
/ insurance would find the terms above. These are pretty long examples, 
but in quite a few languages words are 'glued' together like this.

To index the parts of a compound word, Solr needs a list of words which 
can form a compound word. Word lists like the one from OpenTaal [1] (a 
free Dutch list used for spell checkers in OpenOffice.org, Firefox, 
Thunderbird, etc.) are not directly usable because they themselves 
contain compound words.
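
To make the idea concrete, here is a minimal sketch in Python of 
dictionary-based decompounding. It is only an illustration of the 
principle, not Solr's own code (Solr ships a filter for this, 
DictionaryCompoundWordTokenFilterFactory). The function name, the 
minimum part length and the four-word list are assumptions made up for 
this example.

```python
def decompound(word, dictionary, min_subword=4):
    """Return every dictionary word found inside `word`, left to right.

    Illustration only: scans all substrings of at least `min_subword`
    characters and keeps those present in the word list.
    """
    lowered = word.lower()
    parts = []
    for start in range(len(lowered)):
        for end in range(start + min_subword, len(lowered) + 1):
            candidate = lowered[start:end]
            # Skip the word itself; we only want its parts.
            if candidate != lowered and candidate in dictionary:
                parts.append(candidate)
    return parts


# Hypothetical list of simple (non-compound) Dutch words.
DICTIONARY = {"recht", "bijstand", "verzekering", "maatschappij"}

tokens = decompound("rechtsbijstandverzekeringsmaatschappijen", DICTIONARY)
print(tokens)
# ['recht', 'bijstand', 'verzekering', 'maatschappij']
```

If these parts are indexed next to the original token, a search for 
'verzekering' matches the long compound without the visitor typing any 
wildcard. The result is only as good as the word list: every entry that 
occurs inside the compound is emitted, so the list needs to be curated 
for this purpose.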

[1] http://www.opentaal.org

-- 
Kind regards / met vriendelijke groet,

Jigal van Hemert.

