Threads

I'm rewriting some old Python scripts in Ruby, just to learn a bit more.
I want to thread this code to check websites concurrently. Could someone
offer advice? What are reasonable limits on threads in Ruby?

    require 'net/http'
    require 'timeout'

    # minolta, pagescope, xerox and printer are assumed to be lowercase
    # strings defined earlier in the script.
    ips.each do |ip|
        begin
            # Give each host at most 5 seconds to connect and respond.
            Timeout.timeout(5) do
                Net::HTTP.start(ip) do |site|
                    data = site.get('/').body.downcase
                    if data.include?(minolta) && data.include?(pagescope)
                        puts printer + "\t" + ip
                    elsif data.include?(xerox) && data.include?(printer)
                        puts printer + "\t" + ip
                    else
                        puts "website\t" + ip
                    end
                end
            end
        # Don't really care about exceptions, just timeouts mostly.
        rescue Timeout::Error
            puts "Timeout\t" + ip
        rescue StandardError
            puts "General Exception\t" + ip
        end
    end

···

--
Posted via http://www.ruby-forum.com/.

i made a small example here:
http://pastie.caboo.se/19495

^ manveru

···

On Wednesday 25 October 2006 11:37, Brad Tilley wrote:

> I'm rewriting some old Python scripts in Ruby. Just to learn a bit more.
> I want to thread this code to check websites concurrently. Could someone
> offer advice? What are reasonable limits on threads in Ruby?

Looks like you're trying to perform a network client operation
simultaneously across a lot of different servers. For a non-threaded
approach, look at the EventMachine library. Sync to the latest source and
look at EventMachine::Deferrable. That should give you considerably more
scalability and performance than trying to solve this with threads. The
Deferrable pattern works like Python's Twisted. If it doesn't make sense to
you, let me know and I can send you some sample code.
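
For readers who haven't met the deferrable idea before, here is a minimal sketch of it in plain Ruby. This is a toy illustration only, not EventMachine's implementation: the class name ToyDeferrable and its internals are made up for this example. The point is just that you attach blocks to an object that stands for a result you don't have yet, and the blocks run when the result arrives.

```ruby
# Toy illustration of the deferrable pattern (hypothetical class,
# NOT EventMachine's implementation).
class ToyDeferrable
  def initialize
    @callbacks = []
    @errbacks  = []
    @status    = :pending
    @args      = nil
  end

  # Register a block to run on success. If the result has already
  # arrived, the block runs immediately.
  def callback(&block)
    if @status == :succeeded
      block.call(*@args)
    elsif @status == :pending
      @callbacks << block
    end
    self
  end

  # Register a block to run on failure.
  def errback(&block)
    if @status == :failed
      block.call(*@args)
    elsif @status == :pending
      @errbacks << block
    end
    self
  end

  # Mark the operation as finished successfully and run the callbacks.
  def succeed(*args)
    finish(:succeeded, args, @callbacks)
  end

  # Mark the operation as failed and run the errbacks.
  def fail(*args)
    finish(:failed, args, @errbacks)
  end

  private

  # Only the first succeed/fail call has any effect.
  def finish(status, args, handlers)
    return unless @status == :pending
    @status = status
    @args   = args
    handlers.each { |handler| handler.call(*args) }
    @callbacks.clear
    @errbacks.clear
  end
end

results = []
request = ToyDeferrable.new
request.callback { |body| results << "got #{body.length} bytes" }
request.errback  { |reason| results << "failed: #{reason}" }

# Later, whatever performs the I/O reports the outcome:
request.succeed("<html></html>")
puts results
```

In a real event-driven library the succeed/fail call would come from the I/O loop when the response is complete, so the calling code never blocks waiting for any single host.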

···

On 10/24/06, Brad Tilley <rtilley@vt.edu> wrote:

> I'm rewriting some old Python scripts in Ruby. Just to learn a bit more.
> I want to thread this code to check websites concurrently. Could someone
> offer advice? What are reasonable limits on threads in Ruby?

Here's an EventMachine code sample that should do what you need. Notice
that this code is non-threaded, but it still does all the HTTP GETs
simultaneously. Of course, you'll want to do something more interesting in
the http.callback block.

···

On 10/24/06, Brad Tilley <rtilley@vt.edu> wrote:

> I'm rewriting some old Python scripts in Ruby. Just to learn a bit more.
> I want to thread this code to check websites concurrently. Could someone
> offer advice? What are reasonable limits on threads in Ruby?

#-------------------------------------

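require 'rubygems'
require 'eventmachine'

# Hosts to fetch; all requests are issued from a single thread.
$addrs = [
  "www.apple.com",
  "www.cisco.com",
  "www.microsoft.com"
]

# Start an asynchronous HTTP GET for "/" on the given host.
def scan_addr addr
  http = EventMachine::Protocols::HttpClient.request(
    :host => addr,
    :port => 80,
    :request => "/"
  )

  # Runs when the response for this host has arrived.
  http.callback {|response|
    puts response[:status]
    puts response[:headers]
    puts response[:content].length
  }
end

# Start the reactor and fire off every request. The loop keeps running
# until EventMachine.stop is called or the process is interrupted.
EventMachine.run {
  $addrs.each {|addr| scan_addr addr}
}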

#-------------------------------------------------

Michael Fellinger wrote:

> i made a small example here:
> http://pastie.caboo.se/19495
>
> ^ manveru

Thank you for the example!

what would be the preferred way to share data from http.callback? does it
need protection? how about sharing with ruby green threads?

cheers.

-a

···

On Fri, 27 Oct 2006, Francis Cianfrocca wrote:

> Here's an EventMachine code sample that should do what you need. Notice,
> this code is nonthreaded, but it still does all the HTTP GETs
> simultaneously. Of course you'll want to do something more interesting in
> the http.callback block.

--
my religion is very simple. my religion is kindness. -- the dalai lama

What are reasonable limits on threads? Say I'm scanning a class B
network (roughly 65K hosts). How would you break up the threads? I seem
to get too many "execution expired" errors if I have more than 500 hosts
in one thread. It seems to work best with 256 groups of 256 hosts each,
or 254 if you exclude the .0 and .255 addresses.

What are reasonable limits when working with threads in Ruby? Any tips?
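
One common alternative to splitting the hosts into fixed groups is a small pool of worker threads that all pull addresses from a shared queue, so a slow host only ties up one worker. The sketch below is only an illustration under stated assumptions: check_host is a hypothetical stand-in for the real Net::HTTP check, the "10.1" prefix and the worker count of 50 are arbitrary, and Queue#close needs Ruby 2.3 or later.

```ruby
require 'thread' # Queue; already loaded on current Rubies

# Hypothetical stand-in for the real Net::HTTP check.
def check_host(ip)
  "website\t#{ip}"
end

# Build every host address in a class B (/16) network,
# skipping the .0 and .255 endings: 256 groups of 254 hosts.
def class_b_hosts(prefix)
  (0..255).flat_map do |third|
    (1..254).map { |fourth| "#{prefix}.#{third}.#{fourth}" }
  end
end

# Run check_host over all ips using a fixed number of worker threads.
def scan(ips, worker_count = 50)
  queue = Queue.new
  ips.each { |ip| queue << ip }
  queue.close # pop returns nil once the queue has been drained

  results = Queue.new
  workers = Array.new(worker_count) do
    Thread.new do
      while (ip = queue.pop)
        results << check_host(ip)
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

hosts = class_b_hosts("10.1")
puts hosts.size
puts scan(hosts.first(3), 2).sort
```

With this shape the thread count becomes a single number to tune by experiment, independent of how many hosts are being scanned (65,024 here, with the .0 and .255 addresses excluded).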

I'm not exactly sure what you're asking, Ara. What happens in this code is
that the HTTP requests are fired off simultaneously, and as they complete,
the callback gets called for each completion, always on the same thread.
(There are no additional green or native threads being spun here.) So
there's no contention and no need for mutex protection. If you wanted for
some reason to run this code simultaneously with unrelated code on other
threads, then of course you'd use the normal thread-safe procedures to sync
this data with your other threads.
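
As a rough sketch of what "the normal thread-safe procedures" can look like when several threads do write to the same structure, here is a Mutex guarding a shared hash. The addresses and the "checked" status are placeholders invented for the example; none of this is needed for the single-threaded callback code itself.

```ruby
results = {}
lock    = Mutex.new

threads = %w[10.0.0.1 10.0.0.2 10.0.0.3].map do |ip|
  Thread.new do
    status = "checked" # placeholder for the real work
    # Only one thread at a time may modify the shared hash.
    lock.synchronize { results[ip] = status }
  end
end
threads.each(&:join)

puts results.size
```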

···

On 10/26/06, ara.t.howard@noaa.gov <ara.t.howard@noaa.gov> wrote:

> what would be the preferred way to share data from http.callback? does it
> need protection? how about sharing with ruby green threads?

No tips there... it always depends on the task at hand - just use what works
for you :)

···

On Thursday 26 October 2006 10:10, Brad Tilley wrote:

> What are reasonable limits on threads? Say I'm scanning a class B
> network (roughly 65K hosts). How would you break up the threads? I seem
> to get too many "execution expired" errors if I have more than 500 hosts
> in one thread. It seems to work best with 256 groups of 256 hosts each,
> or 254 if you exclude the .0 and .255 addresses.
>
> What are reasonable limits when working with threads in Ruby? Any tips?

k - that's the answer i was looking for.

cheers.

-a

···

On Fri, 27 Oct 2006, Francis Cianfrocca wrote:

> If you wanted for some reason to run this code simultaneously with unrelated
> code on other threads, then of course you'd use the normal thread-safe
> procedures to sync this data with your other threads.

You might try using a MapReduce/DRb based approach instead of threads.
Have a look at Starfish, by Lucas Carlson, for ideas.

Farrel

···

On 26/10/06, Michael Fellinger <m.fellinger@gmail.com> wrote:

> On Thursday 26 October 2006 10:10, Brad Tilley wrote:
> > What are reasonable limits on threads? Say I'm scanning a class B
> > network (roughly 65K hosts). How would you break up the threads? I seem
> > to get too many "execution expired" errors if I have more than 500 hosts
> > in one thread. It seems to work best with 256 groups of 256 hosts each,
> > or 254 if you exclude the .0 and .255 addresses.
> >
> > What are reasonable limits when working with threads in Ruby? Any tips?
>
> No tips there... it always depends on the task at hand - just use what works
> for you :)