[TYPO3-english] Crawler and multiple news pages (Does a 'real' crawler exist?)

Bartbogdan bdubelaar at sundayafternoon.nl
Fri Jun 19 14:40:28 CEST 2009


Hi all,

We are currently working on a site with multiple single news pages. 
There are multiple news categories, and news items from certain 
categories are shown on different single news pages.
The problem is that the crawler for indexed search indexes all 
available news items on every single news page. So every news article 
is listed as many times as there are single news pages.

Currently I see 2 possible ways to solve this problem:
- Limit the single news view to a specific category (like one can do in 
LIST and LATEST view). It seems that the SINGLE tt_news view does not 
respect a category selection.
- Find a method to 'crawl' the site the way 'crawling' is intended. If 
the crawler just followed all the links on the website, then every 
page would be found and everything would turn out fine. Every news 
item would then only be found in the right places.
As the crawler works right now (correct me if I'm wrong), it is not 
actually a crawler. You just hand over the URLs it should look at; it 
does not actually crawl through the links on your pages.
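
To make the second option concrete, here is a rough sketch in Python of 
what link-following crawling could look like. It is not TYPO3 code and 
has nothing to do with how the crawler extension or indexed search work 
internally. The names crawl, fetch and SITE, and the example.org URLs, 
are made up for illustration only.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl from start_url, staying on the same host.

    fetch(url) is a hypothetical callable that returns the page HTML as a
    string, or None if the page does not exist.
    Returns the URLs that were visited, in visiting order.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            target, _fragment = urldefrag(urljoin(url, href))
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


# Made-up site: two category pages, each linking to its own single view.
SITE = {
    "http://example.org/":
        '<a href="/sport/">Sport</a> <a href="/politics/">Politics</a>',
    "http://example.org/sport/":
        '<a href="/sport/single?tt_news=1">Match report</a>',
    "http://example.org/politics/":
        '<a href="/politics/single?tt_news=2">Election</a>',
    "http://example.org/sport/single?tt_news=1": "Match report body",
    "http://example.org/politics/single?tt_news=2": "Election body",
}

# dict.get returns None for unknown URLs, which crawl() treats as "no page".
pages = crawl("http://example.org/", SITE.get)
```

Because only URLs that are actually linked somewhere get visited, news 
item 1 is reached through the sport single page only, and the 
combination /politics/single?tt_news=1 never shows up in the result.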

What are your thoughts about this? Is there any way to solve this 
problem? By restricting the single news view, or by crawling in a 
different way?

Best regards,

Bart



