[TYPO3] Crawler not working as I would expect it.
Walrick Bosch
lists at globalhealingcircle.net
Thu Sep 18 14:28:37 CEST 2008
Hello,
I've installed the Crawler extension and am now trying to get it to
work. I use one index configuration which indexes the page tree.
I have set the following configuration in the Page TSconfig of the root
page of the site:
tx_crawler.crawlerCfg.paramSets {
language = &L=[|_TABLE:pages_language_overlay;_FIELD:sys_language_uid]
language.procInstrFilter = tx_indexedsearch_reindex, tx_indexedsearch_crawler
language.baseUrl = http://www.globalhealingcircle.net/
}
I'm not sure if this is enough/correct.
But the main thing is, when I look at the crawler log after a couple of
runs, I'm getting a lot of lines like the following, but only behind the
root page of the site.
3542 18-09-08 14:06:16 18-09-08 14:06:16 OK 1 [Index Cfg UID#1]
128152761
The lines for the other pages stay empty at first. Then after a while
they start getting lines like:
Agenda 3570 18-09-08 14:06:30 - ..
http://www.globalhealingcircle.net/index.php?id=1706
tx_indexedsearch_reindex; tx_indexedsearch_crawler 0
Is it normal to get so many lines behind the root page without a full URL?
I notice that the lines behind the other pages have the full URL, but
instead of [Index Cfg UID#1] they get tx_indexedsearch_reindex;
tx_indexedsearch_crawler.
-----
Also our hosting provider has set the cron job as follows:
*/10 * * * * username nice -n+19 php -q /...../cli_dispatch.phpsh crawler >/dev/null 2>/dev/null
With the right path and username of course. But if I look at the CLI
Status page, nothing seems to happen. If I just enter
"/...../cli_dispatch.phpsh crawler" using SSH, it works fine. (The user
_cli_lowlevel exists.)
Any idea why?
I'd be grateful for any help.
Regards,
Walrick
--
webmaster Global Healing Circle
www.globalhealingcircle.net