[TYPO3-Solr] Problem: the same page appears twice in search results

Stephan Schuler Stephan.Schuler at netlogix.de
Mon Mar 4 13:00:06 CET 2013


Hey there.


Is there any solution that could be implemented easily, maybe on top of the current solr extension?
I am thinking of something like this:

accessGroupFilterFunction: knows which fe_groups are relevant to indexing.
Maybe there are 1000 user groups, but only a couple of them are really relevant to indexing.
Pseudo piped code: "SELECT DISTINCT usergroups FROM fe_users | accessGroupFilterFunction" yields the small set of user group combinations that are actually in use. The function could work exactly like the cHash GET parameter filtering: bring the user groups into a reproducible order and filter out those that are not meant to influence the access level.

The search can do the very same with the current fe_user's user groups: pipe them through the accessGroupFilterFunction and use the result as the filter parameter for the solr search.
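
The idea above could be sketched roughly as follows. This is only an illustration in Python, not code from the solr extension: the function name access_group_filter, the RELEVANT set and the sample rows are all assumptions made up for the sketch, and the list of strings stands in for the result of the pseudo query.

```python
def access_group_filter(usergroups: str, relevant: frozenset) -> str:
    """Normalise a comma-separated group list: drop ids that do not
    influence access, remove duplicates and sort numerically."""
    ids = {int(part) for part in usergroups.split(",") if part.strip()}
    return ",".join(str(gid) for gid in sorted(ids & relevant))


# Assumption: only groups 1 and 2 actually restrict indexed content.
RELEVANT = frozenset({1, 2})

# Hypothetical stand-in for the rows returned by the pseudo query
# "SELECT DISTINCT usergroups FROM fe_users".
distinct_usergroups = ["1,3", "3,1", "2,1", "3", "", "2,7,1"]

# Index side: one variant per distinct normalised combination.
index_variants = sorted(
    {access_group_filter(row, RELEVANT) for row in distinct_usergroups}
)

# Search side: the logged-in user's groups go through the same function,
# so the resulting string matches one of the indexed variants.
search_filter = access_group_filter("3,1", RELEVANT)
```

Because both sides use the same normalisation, "1,3" and "3,1" end up as the same value, which is the reproducible order mentioned above.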

Of course I would not do the "SELECT DISTINCT" stuff on each indexing run. Instead I would suggest making it either configurable or cacheable. In particular I would try to avoid dynamically changing setups: if a valid user group combination isn't present at the first indexing run but appears later because of a new user registration, this would suddenly force a new indexing run over maybe thousands of records.
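
A minimal sketch of the "cacheable" variant, assuming the combinations are simply frozen in a JSON file. The function name, the file name and the compute callback are hypothetical; a real setup would more likely use the extension configuration or a TYPO3 cache.

```python
import json
import tempfile
from pathlib import Path


def load_group_combinations(cache_file: Path, compute) -> list:
    """Return the cached combinations; compute() is only called when no
    cache file exists yet, so the list stays stable between runs."""
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    combinations = sorted(set(compute()))
    cache_file.write_text(json.dumps(combinations), encoding="utf-8")
    return combinations


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "access_group_combinations.json"
    # First indexing run: the combinations are computed and frozen.
    first = load_group_combinations(cache, lambda: ["1", "", "1,2", "1"])
    # Later a new registration introduces "2", but the frozen list is
    # reused, so no surprise re-indexing is triggered.
    second = load_group_combinations(cache, lambda: ["1", "", "1,2", "2"])
```

Refreshing the list then becomes an explicit step (deleting the cache file in this sketch) instead of something that happens as a side effect of a user registration.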

Example for filtering:
User groups 1, 2 and 3 are available. Only 1 and 2 influence the cache.
Index creates index entries for "", "1", "2" and "1,2".
An fe_user with groups "1,3" gets filtered to "1".

If there are 1000 user groups but only 1 and 2 influence the cache, the number of solr documents is exactly the same.
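
The filtering example can be reproduced with a few lines of Python. The helper names are made up for this sketch and nothing here is part of the solr extension; it only enumerates every subset of the relevant groups and filters a user's groups the same way.

```python
from itertools import combinations


def enumerate_index_variants(relevant):
    """All subsets of the relevant groups as sorted, comma-separated strings."""
    ordered = sorted(relevant)
    return [
        ",".join(map(str, subset))
        for size in range(len(ordered) + 1)
        for subset in combinations(ordered, size)
    ]


def filter_groups(user_groups, relevant):
    """Keep only the relevant groups of a user, in ascending order."""
    return ",".join(str(g) for g in sorted(set(user_groups) & set(relevant)))


variants = enumerate_index_variants({1, 2})   # "", "1", "2", "1,2"
filtered = filter_groups([1, 3], {1, 2})      # "1"
# A user who belongs to all of 1000 groups still maps to one known variant.
many = filter_groups(range(1, 1001), {1, 2})  # "1,2"
```

With n relevant groups there are at most 2^n variants, no matter how many groups exist in total.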

I really think this can be implemented right on top of the current solr extension.

To be honest, I don't need this currently, so I'm not going to implement it. Especially not without a proper setup/project to test the results. But I do see the problem, and I do believe that it should be solved in some way other than "live with it".

Additionally: I would suggest making this optional. There are setups currently running very well; this new access level filtering behavior should not be the default but should require a certain extension manager flag to be set.


Kind regards,



Stephan Schuler

Web developer

Phone: +49 (911) 539909 - 0
Email: Stephan.Schuler at netlogix.de
Website: media.netlogix.de


--
netlogix GmbH & Co. KG
IT-Services | IT-Training | Media
Andernacher Straße 53 | 90411 Nürnberg
Phone: +49 (911) 539909 - 0 | Fax: +49 (911) 539909 - 99
Email: info at netlogix.de | Internet: http://www.netlogix.de

netlogix GmbH & Co. KG is registered at the Amtsgericht Nürnberg (HRA 13338)
General partner: netlogix Verwaltungs GmbH (HRB 20634)
VAT identification number: DE 233472254
Managing directors: Stefan Buchta, Matthias Schmidt



-----Original Message-----
From: typo3-project-solr-bounces at lists.typo3.org [mailto:typo3-project-solr-bounces at lists.typo3.org] On behalf of Dmitry Dulepov
Sent: Monday, March 4, 2013 12:20
To: typo3-project-solr at lists.typo3.org
Subject: Re: [TYPO3-Solr] Problem: the same page appears twice in search results

Hi!

Olivier Dobberkau wrote:
> Please have a look how we extract the usergroups from the actual page.
> we do not index bruteforce.

Solr simply gets the list of groups from content elements and pages in the rootline, that's all. I know that :)

--
Dmitry Dulepov
TYPO3 CMS core & security teams member

Love gorillas.
_______________________________________________
TYPO3-project-solr mailing list
TYPO3-project-solr at lists.typo3.org
http://lists.typo3.org/cgi-bin/mailman/listinfo/typo3-project-solr

