[TYPO3-dev] indexed_search + crawler + Extbase-based extension's data

Xavier Perseguers typo3 at perseguers.ch
Thu Sep 30 16:31:01 CEST 2010


Hi,

It seems I'm facing limitations of both indexed_search and crawler when 
it comes to "complex" configuration of indexing database records.

Before I start creating patches, I'd like to know whether I'm the first 
one facing those problems or if I simply missed some other way to 
configure the indexing.

Problem:

I have an extension managing companies. As each company has many related 
records, I created a dedicated sysfolder for each company where I store 
the Extbase domain object "Company" along with all its related entities. 
This results in something like this in the page tree:

.
`-- [sysfolder] Companies
        |-- [sysfolder] ABC
        |-- [sysfolder] MS
        |-- [sysfolder] Apple
        `-- [sysfolder] ...

Each of these sysfolders contains a Company record with the same name 
(i.e. "ABC", "MS", "Apple", ...), stored in the table

tx_myext_domain_model_company

To show those records, I have a plugin on a page which shows the "index" 
of all companies and the details of a single one, by changing the 
Extbase action of the plugin. The tree then looks like this:

.
|-- page with company plugin
`-- [sysfolder] Companies
        |-- [sysfolder] ABC
        |-- [sysfolder] MS
        |-- [sysfolder] Apple
        `-- [sysfolder] ...

This plugin has its Starting Point set to the sysfolder "Companies" with 
1 level / infinite recursion (which one does not matter yet). This 
effectively shows all companies. So far so good.

Now, I want to index data from my companies, let's say simply their name.

To do so, I create an Indexing Configuration record on the page 
containing the plugin:

Type: Database Records
Table to index: Company
Alternative source page: HERE IS THE PROBLEM
Fields: name
GET parameter string: 
&tx_myext_pi1[company]=###UID###&tx_myext_pi1[action]=show&tx_myext_pi1[controller]=Company
Calculate cHash: YES

This is the first problem: as my records are stored not in a single 
sysfolder but in multiple ones, indexed_search cannot index them at once, 
because it searches for records either on the current page (if 
Alternative Source Page is undefined) or on the page that is defined 
there. It does not allow me to select my "Companies" sysfolder and set 
some recursion on it.
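
To make clear what I mean by "recursion" here, this is a minimal sketch 
of the lookup I would expect. It is not indexed_search code: the page 
uids, the in-memory tables and the function names collect_pids and 
records_to_index are all made up for illustration.

```python
# Hypothetical stand-in for the "pages" table: page uid -> parent pid.
# 10 is the "Companies" sysfolder, 11..13 are ABC, MS and Apple.
PAGES = {10: 0, 11: 10, 12: 10, 13: 10}

# Hypothetical stand-in for tx_myext_domain_model_company.
COMPANIES = [
    {"uid": 1, "pid": 11, "name": "ABC"},
    {"uid": 2, "pid": 12, "name": "MS"},
    {"uid": 3, "pid": 13, "name": "Apple"},
]


def collect_pids(pages, start_pid, depth):
    """Return start_pid plus all descendant page uids, `depth` levels down."""
    pids = [start_pid]
    if depth > 0:
        for uid, parent in sorted(pages.items()):
            if parent == start_pid:
                pids.extend(collect_pids(pages, uid, depth - 1))
    return pids


def records_to_index(records, pids):
    """Keep only the records stored on one of the given pages."""
    allowed = set(pids)
    return [record for record in records if record["pid"] in allowed]


# Single source page, as today: nothing is stored on "Companies" itself.
print(records_to_index(COMPANIES, collect_pids(PAGES, 10, 0)))

# With one level of recursion, all three companies are found.
print(records_to_index(COMPANIES, collect_pids(PAGES, 10, 1)))
```

With depth 0 only the "Companies" sysfolder itself is searched, which is 
the situation described above.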

Now, in order to index those records with crawler, I have to create a 
crawler configuration. I'm using the Crawler configuration record here, 
which is the newer approach, even though the documentation still only 
shows screenshots of the pageTS configuration.

I put this configuration on my root page:

Name: my-companies
Processing instruction filter: Re-indexing [tx_indexedsearch_reindex]
Configuration: 
&tx_myext_pi1[company]=[_TABLE:tx_myext_domain_model_company]&tx_myext_pi1[action]=show&tx_myext_pi1[controller]=Company

For those not aware of what crawler actually does with this 
configuration: it generates GET parameters where the parameter 
tx_myext_pi1[company] takes each and every UID value found for _TABLE 
tx_myext_domain_model_company, effectively indexing each and every 
company with the corresponding Indexing Configuration.
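
As a rough illustration of that expansion (a sketch only, not crawler's 
actual code), the following replaces the [_TABLE:...] placeholder with 
each UID. The rows, the expand function and its optional pid argument are 
hypothetical; the pid filter only models the assumption that the lookup 
is limited to a single page.

```python
import re

# Hypothetical rows of tx_myext_domain_model_company as (uid, pid) pairs.
ROWS = [(1, 11), (2, 12), (3, 13)]

# The configuration string from above, split only for readability.
TEMPLATE = (
    "&tx_myext_pi1[company]=[_TABLE:tx_myext_domain_model_company]"
    "&tx_myext_pi1[action]=show&tx_myext_pi1[controller]=Company"
)

# Matches the whole [_TABLE:<table name>] placeholder.
PLACEHOLDER = re.compile(r"\[_TABLE:[^\]]+\]")


def expand(template, rows, pid=None):
    """Yield one GET parameter string per record uid.

    If pid is given, only records stored on that page are used;
    pid=None means no page restriction at all.
    """
    for uid, row_pid in rows:
        if pid is None or row_pid == pid:
            yield PLACEHOLDER.sub(str(uid), template)


for params in expand(TEMPLATE, ROWS):
    print(params)
```

Restricting the lookup to one page, e.g. expand(TEMPLATE, ROWS, pid=10), 
yields nothing in this sketch, because no company is stored on page 10 
itself.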

Again, it misses the subfolders.

I could add a _PID clause to the crawler configuration, but I had to 
patch crawler in order to "remove" the pid WHERE clause. That effectively 
generates an indexing URL for every company found in the database, but 
the companies are still not indexed, as they do not match the storage 
folder defined in the Indexing Configuration.

So the question is: is there an easy way to do that? Creating as many 
Indexing Configurations as I have companies (and thus sysfolders for 
them) is not an option!

Xavier



