[TYPO3-Solr] Problem: the same page appears twice in search results
Dmitry Dulepov
dmitry.dulepov at gmail.com
Mon Mar 4 10:11:52 CET 2013
Hi!
Irene Eglin wrote:
> Olivier is right - we would not like to have a page indexed for exact
> combinations of user groups. It was something we didn't like at all with
> the old Sitesearch.
>
> This is because we want users to get the content of a page if they are
> in ANY of the given user groups, not in all of them.
That is unnecessarily broad. EXT:solr's current approach puts more data in
the index than necessary, and some of it is useless: EXT:solr adds group
combinations that can never occur and creates duplicate results.
Ideally, Solr should take the exact list of the current user's groups
and use it as a filter when searching. That always gives you a single
proper document for a single URL (no duplicates!). This is how Kasper did
it in indexed_search+crawler and it works correctly there. You may even
avoid the need for an access filter (a Java component) in the Solr server.
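As a rough illustration of the exact-combination idea (not EXT:solr's or
indexed_search's actual code), the sketch below builds one canonical key
per set of groups and turns it into a Solr filter query. It is Python for
brevity only; the field name "groupCombination" and both function names are
hypothetical, and it assumes each document was indexed with the same key in
a plain string field.

```python
def group_key(group_ids):
    """Canonical key for a group combination: de-duplicated, sorted, comma-joined."""
    return ",".join(str(g) for g in sorted(set(group_ids)))


def access_filter(group_ids, field="groupCombination"):
    """Build a Solr fq value that matches exactly one group combination.

    'groupCombination' is a hypothetical field name, not one from EXT:solr.
    """
    return '{}:"{}"'.format(field, group_key(group_ids))
```

Because the key is sorted and de-duplicated, the same set of groups always
maps to the same filter value, so a search can match at most one document
per URL.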
The only disadvantage of this approach is that pages must be reindexed
when a user has a new combination of groups. That should not happen
with a properly designed site, but it is possible to foresee such an issue and
mark pages for reindexing in a tcemain hook.
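The check such a hook would need can be sketched as follows. This is an
assumption-laden illustration in Python, not TYPO3 code: the function name
and the in-memory set of known combinations are hypothetical, and a real
hook would persist the combinations and queue the affected pages itself.

```python
def needs_reindex(group_ids, known_combinations):
    """Return True if this group combination has not been seen before.

    known_combinations is a set of canonical keys (hypothetical storage);
    a new key is recorded so that the next call returns False.
    """
    key = ",".join(str(g) for g in sorted(set(group_ids)))
    if key in known_combinations:
        return False
    known_combinations.add(key)
    return True
```

Only a True result would trigger marking pages for reindexing, so users
who keep to already known combinations cause no extra indexing work.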
Just my 2 cents :)
--
Dmitry Dulepov
TYPO3 CMS core & security teams member
Love gorillas.