[TYPO3-core] RFC: Bug #7787: Yahoo, MSN, Ask.com and Alexa are not recognized in TypoScript robot condition

Steffen Kamper steffen at sk-typo3.de
Sun Mar 9 22:24:53 CET 2008


"Oliver Hader" <oliver at typo3.org> wrote in message 
news:mailman.1.1204975610.13424.typo3-team-core at lists.netfielders.de...
> Hi Steffen,
>
> Steffen Kamper wrote:
>> "Oliver Hader" <oliver at typo3.org> wrote in message 
>> news:mailman.1.1204974421.13424.typo3-team-core at lists.netfielders.de...
>>> Solution:
>>> Add parts of the user agent string of the HTTP header to recognize them
>>> correctly:
>>> * Yahoo -> slurp
>>> * MSN -> msnbot
>>> * Ask.com -> teoma
>>> * Alexa -> ia_archiver
>>
>> +1 by reading. In general I would like to have such data moved out of the 
>> code, e.g. into text files for easy maintenance. But for 4.2 this should be 
>> left as it is.
>
> I checked for user agent strings on http://www.user-agents.org/ and had a 
> similar feeling. Currently we only have some of the famous robots/crawlers 
> but there are still many others out there. I don't say that we need all of 
> them, but just some more.
>
> olly
> -- 

I had a closer look at this database and at the code we would need for it. The 
routine checks whether a single keyword occurs anywhere in the complete user 
agent string. If we fetch a list of full user agent strings, we can't do the 
match the other way around.

So I think this static list is OK if we update it on request.
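
To illustrate the keyword matching, here is a minimal sketch in Python rather
than TYPO3's PHP. It is not the core code: the names ROBOT_KEYWORDS and
is_robot are hypothetical, the case-insensitive comparison is an assumption,
and the sample user agent strings are only illustrative. The four keywords are
the ones proposed in this thread.

```python
# Hypothetical static list; these four keywords are the ones proposed
# in this thread (Yahoo, MSN, Ask.com, Alexa).
ROBOT_KEYWORDS = ("slurp", "msnbot", "teoma", "ia_archiver")


def is_robot(user_agent: str) -> bool:
    """Return True if any keyword occurs anywhere in the user agent string.

    Lowercasing before the comparison is an assumption of this sketch.
    """
    ua = user_agent.lower()
    return any(keyword in ua for keyword in ROBOT_KEYWORDS)


# Illustrative user agent strings, not taken from the thread.
print(is_robot("Mozilla/5.0 (compatible; Yahoo! Slurp)"))
print(is_robot("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Firefox/2.0"))
```

A list of complete user agent strings would only match the exact variants it
contains, whereas a short keyword also matches strings that carry a different
version number or URL.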

Regards, Steffen



