[TYPO3-core] RFC: Bug #7787: Yahoo, MSN, Ask.com and Alexa are not recognized in TypoScript robot condition
Steffen Kamper
steffen at sk-typo3.de
Sat Mar 8 13:01:41 CET 2008
"Oliver Hader" <oliver at typo3.org> wrote in message
news:mailman.1.1204975610.13424.typo3-team-core at lists.netfielders.de...
> Hi Steffen,
>
> Steffen Kamper wrote:
>> "Oliver Hader" <oliver at typo3.org> wrote in message
>> news:mailman.1.1204974421.13424.typo3-team-core at lists.netfielders.de...
>>> Solution:
>>> Add parts of the user agent string of the HTTP header to recognize them
>>> correctly:
>>> * Yahoo -> slurp
>>> * MSN -> msnbot
>>> * Ask.com -> teoma
>>> * Alexa -> ia_archiver
>>
>> +1 by reading. In general I would like to have such data moved out of the
>> code, e.g. into text files for easy maintenance. But for 4.2 this should be
>> left as it is.
>
> I checked for user agent strings on http://www.user-agents.org/ and had a
> similar feeling. Currently we only cover some of the well-known
> robots/crawlers, but there are many others out there. I'm not saying that
> we need all of them, just some more.
>
Yes, good idea to fetch the list from an external source. It could be saved in
temp and updated on request, in the EM for example.
They provide an XML file that could easily be used:
http://www.user-agents.org/allagents.xml
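
To illustrate the idea, here is a minimal sketch (in Python, not the actual
TYPO3 code) of matching the user agent against a list of substrings that lives
outside the code. The four tokens are the ones from the proposed solution; the
names ROBOT_TOKENS, load_tokens and is_robot, and the one-token-per-line text
format, are only assumptions for this example.

```python
# Hypothetical sketch: these names are illustrative and not part of TYPO3.

# Default tokens, taken from the solution proposed in this thread.
ROBOT_TOKENS = ("slurp", "msnbot", "teoma", "ia_archiver")


def load_tokens(text):
    """Parse an externally maintained list: one token per line.

    Blank lines and lines starting with '#' are skipped (assumed format).
    """
    return tuple(
        line.strip().lower()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )


def is_robot(user_agent, tokens=ROBOT_TOKENS):
    """Return True if any token occurs in the user agent (case-insensitive)."""
    ua = user_agent.lower()
    return any(token in ua for token in tokens)
```

With a list kept in a text file, the tokens could be loaded once via
load_tokens() and passed to is_robot(), so adding a crawler would not require
a code change.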
Regards, Steffen