[TYPO3] crawler_cli.phpsh not working?

J.J.W. Witteman jwittema at yahoo.com
Tue Jun 13 15:38:32 CEST 2006


Thanks for the reply. 

I too hardcoded the script's own path in the script and followed
the manual to configure the crawler.
(http://typo3.org/documentation/document-library/extension-manuals/crawler/current/view/1/3/)
Did you set any special options for the backend user
'_cli_crawler'?

As far as I can tell the crawler is correctly configured, because
after running it a couple of times from the backend, everything I
want indexed gets indexed. However, I don't want to click the "Run
now" button every day ;-)

To answer your question: you can prevent the menu on every page
from being indexed by placing <!--TYPO3SEARCH_begin--> and
<!--TYPO3SEARCH_end--> markers in your template around the
part(s) you do want to get indexed (i.e. the content).
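
For illustration, a minimal sketch of such a template. The
surrounding markup (the div ids and the ###MENU### / ###CONTENT###
subparts) is made up for the example; only the two comment
markers are what indexed_search looks for:

```html
<body>
  <!-- Hypothetical menu container: outside the markers,
       so it is left out of the index -->
  <div id="menu">###MENU###</div>

  <!-- Hypothetical content container: between the markers,
       so only this part gets indexed -->
  <!--TYPO3SEARCH_begin-->
  <div id="content">###CONTENT###</div>
  <!--TYPO3SEARCH_end-->
</body>
```

Pages that were already indexed probably need to be re-indexed
before the menu words disappear from the search results.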

Regards, Jeroen

> I had problems getting it running at all.
> First of all, on this server for some reason I couldn't
> 'execute' the crawler in cli mode at all, so I wrote a little
> helper script. Something along the lines of: php -q
> path/to/the/crawler.cli.php
> 
> Second of all, in cli mode the crawler could not figure out its
> own path name, so I had to hard code it. The config file wasn't
> found.
> 
> After that was solved I had to go in the backend and prepare
> 'something' for crawler mode. I wasn't really clear about that
> but following the crawler manual did the trick.
> 
> At some point when I started the crawler it took about 2-3
> minutes before my site was indexed.  When I started it again
> right afterwards it was fast and returned directly. I think it
> was not scheduled yet.  However, I couldn't figure out clearly
> when it was scheduled for the next time. I need to test this, I
> think.  The next schedule time stayed empty and I didn't
> understand why.  These were my first baby crawler
> steps........ I need to play around with it more.
> 
> I also badly need to tune indexed-search. For example, if I
> look for the word 'downloads' on my site now, it gives me back
> all pages with the word download on the page, which means all
> pages, since it gets the word from the menu!!! When I go to
> typo3.org and do the same I basically get the same results,
> meaning a lot of the same pages, so I am not sure if I can
> solve it... Or typo3 is misconfigured... I don't know...  I
> need to play much more with that search engine I think, or I
> will install mnogosearch for this client.
> 
> So far.......
> 
> Ries
> 
>> Still no answer...
>> Doesn't anybody know the answer or does nobody understand
>> the problem?
>>
>> When running crawler_cli.phpsh no error messages are shown
>> and nothing happens (the crawler queue doesn't get processed).
>>
>> When adding some debug echo statements to the script I see
>> that it doesn't seem to get past the 'init.php' line:
>>
>> --- crawler_cli.phpsh ---
>> [...]
>> echo "Debug 1\n";
>> // Include init file:
>> require(dirname(PATH_thisScript).'/'.$BACK_PATH.'init.php');
>> echo "Debug 2\n";
>> [...]
>>
>> This will result in "Debug 1" getting printed in the console
>> but "Debug 2" doesn't. What could be wrong?
>>
>> Thanks
>>
>>   
>>> I can't get the crawler to run from the command line.
>>> Everything else seems to work fine. Entries are added to the
>>> crawler's queue and when I press the "Run now" button in the
>>> backend they get processed. However when I run the crawler
>>> from the command line nothing happens. The "Last seen" time
>>> does not get updated.
>>>
>>> I'm calling the script with the full path as shown in the
>>> backend:
>>> '/data/www/intraweb/typo3conf/ext/crawler/cli/crawler_cli.phpsh'.
>>> No error messages are shown. I did create a backend user
>>> named '_cli_crawler'. Does that user need any special
>>> options set? What about the password for that user?
>>>
>>> I tried setting the PATH_thisScript variable manually to the
>>> correct value, but that didn't help either.
>>>
>>>
>>> define('PATH_thisScript','/data/www/intraweb/typo3conf/ext/crawler/cli/crawler_cli.phpsh');
>>>
>>> Any advice?
>>>
>>> Thanks.





More information about the TYPO3-english mailing list