[TYPO3-english] Crawler and external documents

Claudio Strizzolo claudio.strizzolo at ts.nogarb.ageinfn.it
Wed Jan 28 09:26:51 CET 2009


Hi all
I'm trying to set up the crawler extension in order to index all the pages
in the site and the external documents (/fileadmin/...) linked by anchors 
in the pages.
I read some documentation, including
http://wiki.typo3.org/index.php/Ext_crawler, and almost everything works:
the pages are correctly indexed, and the external documents are recognized.
In the Crawler Log they are listed in separate rows under the page which
points to them.
However, their status is ".." and their contents are not indexed. If I
click on the "Read" icon (it looks more like a reload icon, imho), the
content is correctly indexed and the status becomes "OK", but I could not
find a way to make this happen automatically through the crawler.
I have a huge number of documents linked in this way, so I would like to
index them without having to click on the Read icon for each of them.
Is there a way to do this? I probably missed something stupid in the
documentation, but I'm puzzled trying to figure it out.
This is the TS config in the root page of the site:

tx_crawler.crawlerCfg.paramSets {
  whole_site =
  whole_site {
    cHash = 1
    procInstrFilter = tx_indexedsearch_reindex, tx_indexedsearch_crawler
    baseUrl = http://www.example.com/
  }
  language = &L=[|_TABLE:pages_language_overlay;_FIELD:sys_language_uid]
  language {
    procInstrFilter = tx_indexedsearch_reindex, tx_indexedsearch_crawler
    baseUrl = http://www.example.com/
  }
  tt_news = &tx_ttnews[tt_news]=[_TABLE:tt_news;_PID:280]
  tt_news {
    procInstrFilter = tx_indexedsearch_reindex, tx_cachemgm_recache
    cHash = 1
    pidsOnly = 301
    baseUrl = http://www.example.com/
  }
}

This is how I run the crawler from the command line:

typo3/cli_dispatch.phpsh crawler_im 34 -d 99 -proc tx_indexedsearch_reindex,tx_indexedsearch_crawler -n 2000 -o exec

Thanks in advance

Claudio

