[TYPO3]  Crawler not working as I would expect it.
    Walrick Bosch 
    lists at globalhealingcircle.net
       
    Thu Sep 18 14:28:37 CEST 2008
    
    
  
Hello,
I've installed the Crawler extension and am now trying to get it to 
work. I use one indexing configuration, which indexes the page tree.
I have set the following configuration in the Page TSconfig of the 
site's root page:
tx_crawler.crawlerCfg.paramSets {
language = &L=[|_TABLE:pages_language_overlay;_FIELD:sys_language_uid]
language.procInstrFilter = tx_indexedsearch_reindex, tx_indexedsearch_crawler
language.baseUrl = http://www.globalhealingcircle.net/
}
I'm not sure if this is enough/correct.
But the main thing is, when I look at the crawler log after a couple of 
runs, I'm getting a lot of lines like the following, but only next to 
the root page of the site.
3542  18-09-08 14:06:16  18-09-08 14:06:16  OK  1 [Index Cfg UID#1] 128152761
The lines for the other pages stay empty at first. Then, after a while, 
they start getting lines like:
Agenda  3570  18-09-08 14:06:30  -  .. 
http://www.globalhealingcircle.net/index.php?id=1706 
tx_indexedsearch_reindex; tx_indexedsearch_crawler  0
Is it normal to get so many lines next to the root page without a full 
URL?
I notice that the lines next to the other pages have the full URL, but 
instead of [Index Cfg UID#1] they show tx_indexedsearch_reindex; 
tx_indexedsearch_crawler
-----
Also our hosting provider has set the cron job as follows:
*/10 * * * * username nice -n+19 php -q /...../cli_dispatch.phpsh crawler >/dev/null 2>/dev/null
With the right path and username, of course. But if I look at the CLI 
Status page, nothing seems to happen. If I just run 
"/...../cli_dispatch.phpsh crawler" over SSH, it works fine. (The user 
_cli_lowlevel exists.)
Any idea why?
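Since the cron line above throws away all output, one way to see what 
happens during a cron run might be to log the output and exit status 
instead. The following is only a minimal sketch of that idea, not 
something the provider has set up: the function name run_logged and the 
log file location are hypothetical, and the elided path stays whatever 
it really is. It may also be worth checking where the line is 
installed, because a user field such as "username" belongs only in a 
system crontab (/etc/crontab or /etc/cron.d); in a per-user crontab it 
would be read as the command to run.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND, appends its stdout and stderr to
# LOGFILE together with a timestamp and the exit status, and returns
# the same exit status as COMMAND.
run_logged() {
    log_file=$1
    shift
    printf '%s start: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >>"$log_file"
    status=0
    # Keep both output streams instead of sending them to /dev/null.
    "$@" >>"$log_file" 2>&1 || status=$?
    printf '%s exit status: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" >>"$log_file"
    return "$status"
}
```

Called from a small script that cron starts, this could look like 
run_logged "$HOME/crawler-cron.log" php -q /...../cli_dispatch.phpsh crawler 
(again with the real path). A log that stays empty would suggest that 
cron never starts the command at all, while an error message or a 
non-zero exit status would point at the environment cron runs it in.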
I'd be grateful for any help.
Regards,
Walrick
-- 
webmaster Global Healing Circle
www.globalhealingcircle.net
    
    
More information about the TYPO3-english
mailing list