[TYPO3-Solr] How to handle changes on related values

Stephan Schuler Stephan.Schuler at netlogix.de
Wed Oct 10 18:12:55 CEST 2012


Hey there.


We have several index queue configurations where a single Solr document is related not just to one TYPO3 record but to several. And we have some index queue configurations with relations to things other than files or records.

Some of them are done by SOLR_RELATION, others aren’t.

As a simple example, think of tx_news_domain_model_news records that are related to tx_news_domain_model_category records. Each news document has a category_stringM field that stores the titles of its related category records.
We’re using _stringM on purpose: we’re doing full text search on categories as well.
So currently our index contains several _stringM fields that are derived from other TYPO3 records but don’t contain the corresponding record uids.
Our facets use the raw value from the Solr result for both the filter GET parameter and the display value.

But that’s only a simple example, which could be solved by using both _stringM fields for full text search and _integerM fields for the actual relation. There are other cases where we don’t even have the chance to use _integerM relations. Just take this as a given. So “avoid string facets” isn’t an option.

This currently works perfectly fine as long as nobody touches the categories. As soon as a category gets changed, say, hidden or renamed, the TYPO3 backend shows the changed value, but Solr doesn’t. When new “news” records relate to the renamed category, their Solr documents show the new category title, but the existing news documents in Solr still refer to the old value. Customers will start to ask “why is this category still present, I just deleted it” or “why is this name not yet changed, and when will that happen”.

Is there a solution for that problem?


I thought about introducing a Solr expiration, configured through TypoScript.
tx_solr_indexqueue_Queue::getItemsToIndex could compare the “changed” value against the current time and return not only items matching “changed > indexed AND errors = ''” but items matching “(changed > indexed AND errors = '') OR changed < $expirationDate”.
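To make the idea concrete, here is a rough sketch of that selection logic. It is written in Python against an in-memory SQLite table, not against the real index queue table; the table name queue_item, the function items_to_index and the max_age parameter are all made up for illustration. Only the WHERE clause mirrors the condition described above.

```python
import sqlite3


def items_to_index(conn, now, max_age):
    """Return uids of queue items that are dirty or older than max_age seconds."""
    # Hypothetical: the expiration date is derived from a configured max age.
    expiration_date = now - max_age
    rows = conn.execute(
        "SELECT uid FROM queue_item"
        " WHERE (changed > indexed AND errors = '')"
        "    OR changed < ?"
        " ORDER BY uid",
        (expiration_date,),
    ).fetchall()
    return [uid for (uid,) in rows]


# Illustrative stand-in for the index queue table (assumed columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queue_item ("
    "uid INTEGER PRIMARY KEY, changed INTEGER, indexed INTEGER, errors TEXT)"
)
conn.executemany(
    "INSERT INTO queue_item VALUES (?, ?, ?, ?)",
    [
        (1, 1000, 900, ""),      # changed after last indexing: selected
        (2, 1000, 1100, ""),     # indexed after last change, recent: skipped
        (3, 100, 200, ""),       # indexed after last change, but expired: selected
        (4, 1000, 900, "boom"),  # changed, but has an error and is recent: skipped
    ],
)

print(items_to_index(conn, now=1200, max_age=600))
```

In this sketch item 3 would be selected again on every run, because nothing updates "changed" after indexing. A real implementation would have to account for that, for example by comparing the expiration date against "indexed" instead.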

I’m aware of the fact that this doesn’t really target my problem but only works around it. But it’s a solution that works with “all kinds of related data”. So I think I’ll implement this.


Really solving my problem would include some kind of cache invalidation configuration. Example:

plugin.tx_solr.index.queue {
    fields {
        myOwnField_stringS = COA
        myOwnField_stringS {
            10 = TEXT
            10.value = Can be everything
            20 = SOLR_RECORD_RELATION_WATCHER
            20.uid = 12345
            20.table = tx_example_domain_model_foobar
            30 = SOLR_CUSTOM_RELATION_WATCHER
            30.userFunc = EXT:example/myservice
        }
    }
}
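Behind such watcher objects there would have to be some bookkeeping of which queue item depends on which record. A minimal sketch of that bookkeeping, again in Python and purely hypothetical (the class RelationWatcher and its methods watch and invalidate don't exist anywhere, and nothing here is persisted), could look like this:

```python
from collections import defaultdict


class RelationWatcher:
    """Maps watched records to the queue items whose documents depend on them."""

    def __init__(self):
        # (watched_table, watched_uid) -> set of (item_table, item_uid)
        self._dependents = defaultdict(set)

    def watch(self, item, table, uid):
        """Record that the document for `item` depends on record table:uid."""
        self._dependents[(table, uid)].add(item)

    def invalidate(self, table, uid):
        """Return the queue items to re-index when record table:uid changes."""
        return sorted(self._dependents.get((table, uid), set()))


watcher = RelationWatcher()
news_1 = ("tx_news_domain_model_news", 1)
news_2 = ("tx_news_domain_model_news", 2)

# Registered while indexing: both news documents use category 7, one also uses 8.
watcher.watch(news_1, "tx_news_domain_model_category", 7)
watcher.watch(news_2, "tx_news_domain_model_category", 7)
watcher.watch(news_2, "tx_news_domain_model_category", 8)

# Renaming or hiding category 7 would flag both news items for re-indexing.
print(watcher.invalidate("tx_news_domain_model_category", 7))
```

The watchers would register dependencies while a document is being built, and a hook on record changes would call something like invalidate to mark the dependent queue items as changed.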

What do you think about that? I’m not going to implement this tomorrow. Maybe I never will. But I really want to know what you think about this kind of problem.

Regards,

Stephan Schuler
Web Developer

Phone: +49 (911) 539909 - 0
E-Mail: Stephan.Schuler at netlogix.de
Website: media.netlogix.de<http://media.netlogix.de>

--
netlogix GmbH & Co. KG
IT-Services | IT-Training | Media
Andernacher Straße 53 | 90411 Nürnberg
Phone: +49 (911) 539909 - 0 | Fax: +49 (911) 539909 - 99
E-Mail: info at netlogix.de<mailto:info at netlogix.de> | Internet: www.netlogix.de<http://www.netlogix.de/>

netlogix GmbH & Co. KG is registered at the Amtsgericht Nürnberg (HRA 13338)
General partner: netlogix Verwaltungs GmbH (HRB 20634)
VAT identification number: DE 233472254
Managing directors: Stefan Buchta, Matthias Schmidt


More information about the TYPO3-project-solr mailing list