[TYPO3-Solr] Find word parts
Jigal van Hemert
jigal at xs4all.nl
Fri Oct 21 19:57:10 CEST 2011
Hi,
On 21-10-2011 18:42, Rik Willems wrote:
>> simply search for table*
>> This works with Solr 3.x
>
> This works perfect. But, I'd like the visitor not to do and see this. I
> want this behaviour by default. Is that possible without the
> 'complicated stuff' in the rest of this thread?
The 'complicated stuff' in the rest of this thread is about a feature
which I think is rather necessary for languages such as German and
Dutch: it makes it possible to index the parts of compound words. Compare:
Dutch: rechtsbijstandverzekeringsmaatschappijen
German: Rechtsschutzversicherungsgesellschaften
English: legal protection insurance companies
With support for compound words, a search for verzekering / Versicherung
/ insurance would find the corresponding term above. These are pretty long
examples, but in quite a few languages words are 'glued' together.
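To make the idea concrete, here is a minimal sketch in Python of what dictionary-based decompounding does at index time. It is not Solr's own code; the function name compound_parts, the min_subword threshold and the tiny word list are made up for illustration.

```python
def compound_parts(word, dictionary, min_subword=4):
    """Return the dictionary entries that occur inside `word`, in order of position."""
    lowered = word.lower()
    found = []
    for start in range(len(lowered)):
        # Only consider substrings of at least `min_subword` characters.
        for end in range(start + min_subword, len(lowered) + 1):
            candidate = lowered[start:end]
            if candidate in dictionary and candidate not in found:
                found.append(candidate)
    return found


# Hypothetical hand-made word list containing only non-compound words.
dutch_words = {"recht", "bijstand", "verzekering", "maatschappij"}

parts = compound_parts("rechtsbijstandverzekeringsmaatschappijen", dutch_words)
print(parts)
```

If the parts are indexed next to the full word, a query for "verzekering" can match the document without the visitor having to type a wildcard.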
To index the parts of a compound word, Solr needs a list of words which
can form a compound word. Word lists like those from OpenTaal [1] (a free
Dutch list used for spell checkers in OpenOffice.org, Firefox,
Thunderbird, etc.) are not directly usable because they contain compound
words.
[1] http://www.opentaal.org
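One way to make such a list usable could be to drop every entry that is itself built from other entries. The following Python sketch shows the idea under simplifying assumptions: the helper is_compound is hypothetical, the only linking element it knows is an optional 's', and the sample list is invented. A real Dutch or German list would need more linking elements and manual checking.

```python
def is_compound(word, words, linkers=("", "s")):
    """True if `word` can be written as two or more other entries of `words`,
    optionally joined by one of the (assumed) linking elements."""

    def splits(rest, count):
        if not rest:
            return count >= 2
        for end in range(1, len(rest) + 1):
            head = rest[:end]
            # A word does not count as a part of itself.
            if head == word or head not in words:
                continue
            tail = rest[end:]
            for linker in linkers:
                if tail.startswith(linker) and splits(tail[len(linker):], count + 1):
                    return True
        return False

    return splits(word, 0)


# Invented sample list mixing base words and compounds.
words = {"recht", "bijstand", "verzekering",
         "rechtsbijstand", "rechtsbijstandverzekering"}

base_words = sorted(w for w in words if not is_compound(w, words))
print(base_words)
```

The remaining base words are the kind of list a compound word filter could work with.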
--
Kind regards / met vriendelijke groet,
Jigal van Hemert.