[TYPO3-core] RFC: Bug #7787: Yahoo, MSN, Ask.com and Alexa are not recognized in TypoScript robot condition

Steffen Kamper steffen at sk-typo3.de
Sat Mar 8 13:01:41 CET 2008


"Oliver Hader" <oliver at typo3.org> wrote in message 
news:mailman.1.1204975610.13424.typo3-team-core at lists.netfielders.de...
> Hi Steffen,
>
> Steffen Kamper wrote:
>> "Oliver Hader" <oliver at typo3.org> wrote in message 
>> news:mailman.1.1204974421.13424.typo3-team-core at lists.netfielders.de...
>>> Solution:
>>> Add parts of the user agent string of the HTTP header to recognize them
>>> correctly:
>>> * Yahoo -> slurp
>>> * MSN -> msnbot
>>> * Ask.com -> teoma
>>> * Alexa -> ia_archiver
>>
>> +1 by reading. In general I would like to have such data moved out of 
>> the code, e.g. into text files for easy maintenance. But for 4.2 this 
>> should be left as it is.
>
> I checked for user agent strings on http://www.user-agents.org/ and had a 
> similar feeling. Currently we only have some of the famous robots/crawlers 
> but there are still many others out there. I don't say that we need all of 
> them, but just some more.
>

Yes, good idea to get the list from an external source. It could be cached in 
a temp file and updated on request, for example from the Extension Manager (EM).
They provide an XML file that would be easy to use:
http://www.user-agents.org/allagents.xml
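
A rough sketch of the idea, in Python purely for illustration (not TYPO3 code). The four substrings are the ones proposed in the quoted patch. The names `detect_robot` and `load_substrings`, and the `<agent>/<name>/<match>` XML layout, are hypothetical; the real allagents.xml may be structured differently and would need its own parsing.

```python
import xml.etree.ElementTree as ET

# Substrings from the proposed patch, mapped to the crawler they identify.
ROBOT_SUBSTRINGS = {
    "slurp": "Yahoo",
    "msnbot": "MSN",
    "teoma": "Ask.com",
    "ia_archiver": "Alexa",
}


def detect_robot(user_agent, substrings=None):
    """Return the crawler name whose substring occurs in user_agent, else None."""
    table = ROBOT_SUBSTRINGS if substrings is None else substrings
    ua = user_agent.lower()  # match case-insensitively
    for needle, name in table.items():
        if needle.lower() in ua:
            return name
    return None


def load_substrings(xml_text):
    """Build a substring table from XML text.

    Assumes a hypothetical layout:
    <agents><agent><name>..</name><match>..</match></agent>..</agents>
    """
    root = ET.fromstring(xml_text)
    table = {}
    for agent in root.iter("agent"):
        needle = (agent.findtext("match") or "").strip().lower()
        name = (agent.findtext("name") or "").strip()
        if needle and name:  # skip incomplete entries
            table[needle] = name
    return table
```

With an externally maintained list, the cached XML would be read once and passed to `detect_robot` as the second argument, so new crawlers could be added without touching the code.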

Regards, Steffen



