[TYPO3-Solr] Problem: the same page appears twice in search results

Dmitry Dulepov dmitry.dulepov at gmail.com
Mon Mar 4 10:11:52 CET 2013


Hi!

Irene Eglin wrote:
> Olivier is right - we would not like to have indexed a page for exact
> combinations of user groups. It was something we didn't like at all with
> the old Sitesearch.
>
> This is, because we want users to get the content of a page if they are
> in ANY of the given user groups - not in all.

That is unnecessarily broad. EXT:solr's current approach puts more data
into the index than necessary, and some of it is useless: EXT:solr adds
group combinations that can never occur and creates duplicate results.

The proper approach is for Solr to take the exact list of the current
user's groups and use it as a filter when searching. That always gives you
a single proper document for a single URL (no duplicates!). This is how
Kasper did it in indexed_search + crawler, and it works correctly there.
You may even avoid the need for an access filter (a Java component) in the
Solr server.
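
To illustrate the idea, here is a minimal in-memory sketch in Python. It is
not EXT:solr or indexed_search code; the names GroupKeyedIndex, group_key,
add and search are made up for this example. It assumes each page version
is stored under the exact, sorted group list it was indexed with, and that
a search filters on the current user's exact group list.

```python
def group_key(groups):
    """Canonical key for a group combination, e.g. [3, 1] -> "1,3"."""
    return ",".join(str(g) for g in sorted(set(groups)))


class GroupKeyedIndex:
    """Hypothetical index: one document per (URL, exact group combination)."""

    def __init__(self):
        self.docs = {}  # (url, group key) -> indexed content

    def add(self, url, groups, content):
        # Re-adding the same URL for the same combination overwrites the
        # old document, so a combination never holds two copies of a page.
        self.docs[(url, group_key(groups))] = content

    def search(self, term, user_groups):
        # Filter on the exact combination of the current user's groups.
        key = group_key(user_groups)
        return sorted(
            url
            for (url, doc_key), content in self.docs.items()
            if doc_key == key and term in content
        )


index = GroupKeyedIndex()
index.add("/news", [], "public teaser")
index.add("/news", [1], "public teaser, member article")
index.add("/news", [1, 2], "public teaser, member article, editor notes")

# A user in groups 2 and 1 matches only the document indexed for "1,2".
results = index.search("teaser", [2, 1])
```

Because the filter is an exact match on one key, a URL can match at most
once per search, whatever the number of combinations in the index.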

The only disadvantage of this approach is that pages have to be reindexed
when a user gets a new combination of groups. That should not happen with
a properly designed site, but it is possible to foresee such an issue and
mark pages for reindexing in the tcemain hook.
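
The bookkeeping behind such a hook could look roughly like the Python
sketch below. It only shows the logic and does not use the TYPO3 tcemain
API; ReindexTracker, on_user_saved and the page_groups mapping are
hypothetical names. It assumes that only pages restricted to at least one
group of the new combination need to be queued.

```python
class ReindexTracker:
    """Hypothetical helper: queues pages when a new group combination appears."""

    def __init__(self, page_groups):
        # page_groups: URL -> set of group ids that restrict access to the page
        self.page_groups = page_groups
        self.known_combinations = set()
        self.reindex_queue = []  # (url, combination) pairs for the indexer

    def on_user_saved(self, groups):
        """Call when a user record is saved; returns the newly queued pairs."""
        combination = frozenset(groups)
        if combination in self.known_combinations:
            return []  # already indexed for this combination
        self.known_combinations.add(combination)
        queued = [
            (url, combination)
            for url, restricted in sorted(
                self.page_groups.items(), key=lambda item: item[0]
            )
            if restricted & combination  # page is visible to one of the groups
        ]
        self.reindex_queue.extend(queued)
        return queued


tracker = ReindexTracker({"/intranet": {1}, "/partners": {2}, "/public": set()})
first = tracker.on_user_saved([1, 2])   # new combination: restricted pages queued
second = tracker.on_user_saved([2, 1])  # same combination again: nothing queued
```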

Just my 2 cents :)

-- 
Dmitry Dulepov
TYPO3 CMS core & security teams member

Love gorillas.

