[TYPO3-Solr] Solr synonyms auto suggest

Stephan Schuler Stephan.Schuler at netlogix.de
Tue Sep 25 10:49:38 CEST 2012


Hey there.


I'm facing the same problem currently.

I don't have synonyms that are really different words
with the same meaning, like "tomatoes, apples, cherry".
I have some (very few) product names that have accents
and umlauts, e.g. "ä" and "ë", that might not be on the
user's keyboard.

But using "ae" and "ee" instead (which Solr can do at
indexing time with a MappingCharFilterFactory) isn't
exactly what my customer wants, since the product name
should not be changed in any case.
Unfortunately, these umlauts usually are the first or
second letter, so it's very likely for the user not to
be able to auto-complete those product names.

Currently I have two steps.
1 At indexing time, I use a MappingCharFilterFactory
  that replaces those letters as described, e.g. "ä"
  becomes "ae", "ë" becomes "ee", etc.
2 I extended the suggest eID to interact with it.
  2 a) My extended suggest eID changes the suggest
       query the same way the MappingCharFilterFactory
       does at indexing time, so even if the user types
       "ä", Solr gets asked for "ae".
  2 b) I have a static list of product names and their
       charFiltered equivalents. The eID now uses this
       list to reconstruct the original product name out
       of whatever the Solr suggestion returns.
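
A minimal sketch of steps 2 a) and 2 b), written in Python
for brevity rather than in the PHP an eID script would use.
The character map, the product names and the function names
are all made up for illustration, and the lowercasing is an
assumption about how the suggest field is analyzed.

```python
# Hypothetical character map; it has to mirror whatever the
# MappingCharFilterFactory mapping file contains.
CHAR_MAP = {"ä": "ae", "ë": "ee"}

# Hypothetical static list of original product names (step 2 b).
PRODUCT_NAMES = ["Ärosol", "Bëlla"]


def fold(text: str) -> str:
    """Apply the same replacements the char filter applies at indexing time."""
    return "".join(CHAR_MAP.get(ch, ch) for ch in text)


# Reverse lookup: folded, lowercased name -> original product name.
REVERSE = {fold(name.lower()): name for name in PRODUCT_NAMES}


def rewrite_query(user_input: str) -> str:
    """Step 2 a): rewrite the suggest query before it is sent to Solr."""
    return fold(user_input.lower())


def restore(suggestion: str) -> str:
    """Step 2 b): map a suggestion back to the original product name.

    Suggestions that are not in the static list are returned unchanged.
    """
    return REVERSE.get(suggestion.lower(), suggestion)
```

With this, a user typing "Är" would query Solr for "aer",
and a returned suggestion "aerosol" would be shown as
"Ärosol". Anything missing from the static list passes
through in its folded form, which is exactly the weakness
of the approach.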

This is obviously a really ugly solution and only works
for expressions I know (and have to maintain) outside of
the Solr index content.

It could be adjusted a bit, to make the MappingCharFilterFactory
replace "ä" by something like "___umlaut_ae___" to kind
of "protect" the umlauts. This would allow the eID
reconstruction to operate without a reverse filter list
for words. But I haven't tried this because my current
solution ... "works for the moment" :), and neither is
exactly the way I want it to work.
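
The reversible-marker idea could be sketched like this,
again in Python. The "___umlaut_ee___" marker and the
function names are assumptions; whether such markers
survive Solr's tokenizer and filters is untested, just
like the idea itself.

```python
import re

# Marker tokens in the spirit of "___umlaut_ae___"; the "ë" entry is assumed.
ENCODE = {"ä": "___umlaut_ae___", "ë": "___umlaut_ee___"}
DECODE = {marker: char for char, marker in ENCODE.items()}

# Matches any of the known markers literally.
MARKER_RE = re.compile("|".join(re.escape(marker) for marker in DECODE))


def protect(text: str) -> str:
    """Replace each special character by its marker (index and query side)."""
    return "".join(ENCODE.get(ch, ch) for ch in text)


def unprotect(text: str) -> str:
    """Turn markers in a suggestion back into the original characters."""
    return MARKER_RE.sub(lambda match: DECODE[match.group(0)], text)
```

Because the mapping is one-to-one, unprotect(protect(x))
returns x for any input, so no per-word reverse list is
needed. The user's query would still have to be rewritten
with protect() before asking Solr.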


I know that my approach doesn't work for you, Soujanya.

If you have synonyms like you described
("fun, entertainment => recreation"), it makes a
difference whether you search for "fun" or for "recreation".
Searching for "fun" results in all objects containing
"fun" or "recreation".
But searching for "recreation" results only in objects
containing "recreation", not necessarily "fun".

So doing the replacement on the TYPO3 side doesn't fit
your needs.

You could create a synonym rule for solr indexing like
this:

fun => recreation, fun__isasynonymfor__recreation
entertainment => recreation, entertainment__isasynonymfor__recreation

This would allow you to strip the ".*__isasynonymfor__"
prefix from the suggestions in an extended eID.

Might work. Try it and let us know what happened :).
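
A rough sketch of that stripping step, in Python instead
of the PHP an eID would use. The function name and the
de-duplication are assumptions added for illustration.

```python
import re

MARKER = "__isasynonymfor__"
# Everything up to and including the marker is dropped.
PREFIX_RE = re.compile(r"^.*" + re.escape(MARKER))


def clean_suggestions(suggestions):
    """Strip the synonym prefix and drop duplicates, keeping the order."""
    seen = set()
    result = []
    for suggestion in suggestions:
        cleaned = PREFIX_RE.sub("", suggestion)
        if cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result
```

Typing "fu" would then match the indexed token
"fun__isasynonymfor__recreation", and the user would be
offered "recreation".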


I would be glad to discuss this a little further, since
I do see a use case for this, even if it's not what
is usually done with suggestions.


Regards,
Stephan.



Stephan Schuler
Web developer

Phone: +49 (911) 539909 - 0
E-mail: Stephan.Schuler at netlogix.de
Website: media.netlogix.de





-----Original Message-----
From: typo3-project-solr-bounces at lists.typo3.org [mailto:typo3-project-solr-bounces at lists.typo3.org] On Behalf Of Jigal van Hemert
Sent: Tuesday, 25 September 2012 09:48
To: typo3-project-solr at lists.typo3.org
Subject: Re: [TYPO3-Solr] Solr synonyms auto suggest

Hi,

On 25-9-2012 7:19, Soujanya Kinnera wrote:
> i,e I need the synonyms in the suggesions ..

As Olivier already tried to explain, the feature is called autocompletion. That means that solr is trying to *complete* the word you started typing with words found in the search index.

If you type "rec" you will receive a list with words like "record", "recess", "recreation", because these all start with "rec". Your synonyms do not start with these letters and will not be in the list.

--
Jigal van Hemert
TYPO3 Core Team member

TYPO3 .... inspiring people to share!
Get involved: typo3.org