[TYPO3-Solr] How to handle changes on related values

Stephan Schuler Stephan.Schuler at netlogix.de
Fri Dec 28 11:06:56 CET 2012


Hey Ingo.

Maybe I was not clear enough. My solution consists of two steps, and they only ship together :).

The first step is the API object I was talking about.
The important part is the incremental filling feature: a new relation parent is initialized, related children are added step by step in any order, and finally a "commit" writes the collected relations to the database. All those methods should be public, but that's a technical detail.
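
A rough sketch of such an incremental collector follows. It is written in Python for brevity rather than PHP, and every class and method name in it is hypothetical and not part of the Solr extension; a plain dict stands in for the database table.

```python
class RelationCollector:
    """Collects the child records touched while one parent record is indexed.

    Hypothetical illustration only; 'storage' is a dict standing in for a
    database table that maps a parent (table, uid) to its child records.
    """

    def __init__(self, storage):
        self.storage = storage
        self.parent = None
        self.children = set()

    def start_parent(self, table, uid):
        # Begin a fresh collection for one parent record.
        self.parent = (table, uid)
        self.children = set()

    def add_child(self, table, uid):
        # Children may arrive in any order; duplicates collapse in the set.
        if self.parent is None:
            return  # no indexing run is active, nothing to track
        self.children.add((table, uid))

    def commit(self):
        # Persist the collected relations in one go, replacing older ones.
        if self.parent is None:
            return
        self.storage[self.parent] = set(self.children)
        self.parent = None
        self.children = set()


storage = {}
collector = RelationCollector(storage)
collector.start_parent("tx_myext_record", 12)
collector.add_child("tt_content", 7)
collector.add_child("tt_content", 8)
collector.add_child("tt_content", 7)  # reported twice, stored once
collector.commit()
```

Replacing the stored set on every commit means relations that disappeared since the last indexing run are dropped automatically.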

The second step is the integration in the existing record indexing mechanism.
As soon as the record indexer creates a new single record indexing run, it automatically calls the "relation parent start" method.
After the record indexer has finished a single record indexing run, it automatically calls the "relation commit" method.
The existing RECORDS typoscript object as well as the existing SOLR_RELATION typoscript object get hooked to automatically call the "relation add child" method.
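
The call order described in this second step could look roughly like the following. Again this is Python used as a language-neutral sketch, not TYPO3 code: the function names, the stand-in collector and the string "document" are all assumptions made for illustration.

```python
class RecordingCollector:
    """Stand-in for the relation collector; it only records the calls it gets."""

    def __init__(self):
        self.calls = []

    def start_parent(self, table, uid):
        self.calls.append(("start", table, uid))

    def add_child(self, table, uid):
        self.calls.append(("add", table, uid))

    def commit(self):
        self.calls.append(("commit",))


def render_related(collector, children):
    """Plays the role of a hooked RECORDS / SOLR_RELATION object."""
    for table, uid in children:
        # The hook: every child that is rendered gets reported.
        collector.add_child(table, uid)
    return " ".join(f"{table}:{uid}" for table, uid in children)


def index_record(collector, table, uid, children):
    """Plays the role of one single record indexing run."""
    collector.start_parent(table, uid)               # called automatically at the start
    document = render_related(collector, children)   # rendering reports the children
    collector.commit()                               # called automatically at the end
    return document


collector = RecordingCollector()
document = index_record(
    collector, "tx_myext_record", 12, [("tt_content", 7), ("tt_content", 8)]
)
```

The integrator never calls the collector directly here; the indexer and the rendering objects do it on their behalf.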

Those two steps together allow instant relation tracking with no user interaction.

But IF the user really wants to take care of this relation tracking themselves: I would suggest two additional typoscript objects (SOLR_RECORD_RELATION_WATCHER and SOLR_CUSTOM_RELATION_WATCHER) that allow user-customized adding of new child relations. Or we can skip the CUSTOM watcher, since in my approach the watcher object is a t3lib_Singleton. So whoever wants to interact with the raw PHP stuff can simply call t3lib_div::makeInstance and get the existing, already started relation collector.
But of course no one has to use PHP or even custom typoscript for relation tracking; nearly everything should be covered by the second step during regular record tracking, which passes it on to the PHP object from the first step.

If pages count here, too: Maybe we should change the behavior a bit? Currently a page gets indexed as soon as its content is changed in the backend. That's nice, but maybe not enough. Think about a plugin that fetches remote RSS content and delivers, say, content cached for 24 hours. The page is indexed once, but its frontend content changes every 24 hours. So we could think about some kind of cache expiration feature for records. But that doesn't really matter here; I think it's a completely new task.

Kind regards,
Stephan.



Stephan Schuler
Web Developer

Phone: +49 (911) 539909 - 0
Email: Stephan.Schuler at netlogix.de
Website: media.netlogix.de


--
netlogix GmbH & Co. KG
IT-Services | IT-Training | Media
Andernacher Straße 53 | 90411 Nürnberg
Phone: +49 (911) 539909 - 0 | Fax: +49 (911) 539909 - 99
Email: info at netlogix.de | Internet: http://www.netlogix.de

netlogix GmbH & Co. KG is registered at Amtsgericht Nürnberg (HRA 13338)
General partner: netlogix Verwaltungs GmbH (HRB 20634)
VAT identification number: DE 233472254
Managing directors: Stefan Buchta, Matthias Schmidt



-----Original Message-----
From: typo3-project-solr-bounces at lists.typo3.org [mailto:typo3-project-solr-bounces at lists.typo3.org] On behalf of Ingo Renner
Sent: Friday, 28 December 2012 06:39
To: typo3-project-solr at lists.typo3.org
Subject: Re: [TYPO3-Solr] How to handle changes on related values

On 27.12.12 02:44, Stephan Schuler wrote:
> Hi Ingo.
>
>
> Thank you for answering my question. For a minute, I thought you would just ignore this because my "workaround" is the way to go.
>
>
> On the one hand I think ref index is kind of unreliable.

True, that's also why we went for the Index Queue back then.

> I don't know if things have changed since extbase rose because I didn't touch the ref index since 4.2 or so.

I think it got better through IRRE and especially FAL. I would still need to check that, though.

> But when I tried to use ref index the last time, I had several extensions that modified database tables without updating the ref index table. So using ref index should work when you deal with records getting modified by the backend only, but everything else feels kind of unpredictable to me. I know that this situation is related to bad programming in foreign extensions and we should not care about stuff that works against the stable API, but if this means a certain percentage of use cases conflict and behave unpredictably, we should do our best to avoid this. A ref index that is not properly managed would be really hard to discover when the error message is "some records are only indexed partially".

Agreed.

> And on the other hand: I don't know if we should restrict the "foreign record tracking" behavior to records that have TCA relations to a record.
> Think about a relation from Tx_Myext_Record to tt_content. That's something the ref index will cover.
> But think about the related tt_content being either a templavoila record or of type "core records" (don't know its exact name, but there definitely is a default core tt_content which collects a couple of UIDs and passes them to the cObject "RECORD" when being rendered). Then you will keep track of relations to the tt_content container record, but this doesn't necessarily change when child records get changed.
> As soon as the child record has relations itself and gets passed to the indexer by default core rendering mechanisms (RECORDS/CONTENT), you completely lose the child-child records that might be important for the rendering output. They influence the rendering output, but you don't know about them and therefore cannot track them. I guess very often we will end up collecting almost every record in the database when we rely on ref index.

Right, that's where the flexibility of page rendering in TYPO3 bites us.
That's also why we chose FE indexing for pages... it's just too flexible.

> Another thing: Not all child record update actions really influence the rendering output. I think the vast majority of relations a record has don't go into solr.

True, but not really a problem when indexing records, as that's usually fast.

>
> And the last thing is very technical: Management of an indexing queue filled with both indexing records and relation records. The indexing queue currently holds the lastIndexTime, then it decides by a simple sql query if a record is younger than the lastIndexTime. When we add relations to the tracking mechanism, we either add them to the indexing_queue table as well and introduce a type field, or we create a dedicated relations table. But in both cases, we need some place to track the lastIndexTime for the related record.

Uhh true! That's the case for pages already. If you change a tt_content record it'll change the page's last updated time to the most recent change time of the tt_content elements on that page.

> So, it's up to you what you want to do. But as you can see, I really would not go ref index but a custom relation tracker.

A relation tracker in general sounds like a good idea. In the end I would still like to follow a principle that we have followed until now:
For the regular integrator it should be very easy and fast to achieve results. So whenever possible I would like to avoid making people touch PHP and do most(/all) things through TS configuration.


Ingo

--
Ingo Renner
TYPO3 Core Developer, Release Manager TYPO3 4.2, Admin Google Summer of Code

TYPO3 - Open Source Enterprise Content Management System http://typo3.org

Apache Solr for TYPO3 -
Open Source Enterprise Search meets Open Source Enterprise CMS http://www.typo3-solr.com
_______________________________________________
TYPO3-project-solr mailing list
TYPO3-project-solr at lists.typo3.org
http://lists.typo3.org/cgi-bin/mailman/listinfo/typo3-project-solr