Polite spiders should obey a robots.txt file. The bad ones, well, you
send the owner a nasty email and/or do your best to block them.
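For reference, a minimal robots.txt that polite crawlers will honor might look like this (the paths are made-up examples, not from any of the sites mentioned):

```
# Served from the site root as /robots.txt
User-agent: *
Disallow: /search
Disallow: /private/
```

Note that this is purely advisory: only well-behaved spiders read it, which is exactly why the "naughty" ones need blocking at the server level instead.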
Regards,
Dan
···
-----Original Message-----
From: Alexandru Popescu [mailto:the.mindstorm.mailinglist@gmail.com]
Sent: Friday, September 16, 2005 12:24 PM
To: ruby-talk ML
Subject: Re: A big project
#: Eric Hodel changed the world a bit at a time by saying on 9/16/2005 8:13 PM :#
> On 15 Sep 2005, at 12:04, David Heinemeier Hansson wrote:
>
>> http://www.43things.com/
>> http://www.43places.com/
>>
>> The sites all fall within the category of your references and they're
>> made with Ruby on Rails. Many of them do more than 400K pageviews per
>> day. I know at least one of them is nearing one million.
>
> Our sites (combined) handled over 1 million requests due to a naughty
> spider for a couple days.
>
Have you solved it? I would be very interested to find out how. Can you
post some hints on how to protect against these spiders?
#: Berger, Daniel changed the world a bit at a time by saying on 9/16/2005 8:46 PM :#
> Polite spiders should obey a robots.txt file. The bad ones, well, you
> send the owner a nasty email and/or do your best to block them.
Heh, that 3rd solution is the one I was asking about, since it doesn't seem to be a trivial one ;-).
./alex
--
[.the_mindstorm.]
Apache provides the 3rd solution. There are plenty of hits on Google.
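For what it's worth, one common Apache approach is to tag requests by User-Agent with mod_setenvif and then deny them; a minimal sketch (the bot name and document root here are placeholders, not the actual spider from this thread):

```
# Match the misbehaving spider's User-Agent (case-insensitive)
SetEnvIfNoCase User-Agent "NaughtyBot" bad_bot

<Directory "/var/www/html">
    Order Allow,Deny
    Allow from all
    # Refuse any request tagged above with a 403
    Deny from env=bad_bot
</Directory>
```

If the spider doesn't send a distinctive User-Agent, blocking by IP range (`Deny from 10.0.0.0/8`, say) or with mod_rewrite rules works along the same lines.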
···
On 16 Sep 2005, at 12:15, Alexandru Popescu wrote:
> Heh, that 3rd solution is the one I was asking about, since it doesn't seem to be a trivial one.
--
Eric Hodel - drbrain@segment7.net - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04