Select loop question

Hi,

I’ve got a question on using a select loop. Here is the basic way I
am using select:

if (answer = select(readfds, writefds, nil, 10))
  answer[0].each do |r|
    # handle reads
  end
  answer[1].each do |w|
    # handle writes
  end
else
  # timeout
end

The question is: How do I know which file descriptors were returned
in the answer arrays? That is: after I get an answer and I finish
reading from the socket, I want to remove that socket from the readfds
array, but I’m not sure how to do that. Do I need to do something
like:
answer[0].each do |r|
  readfds.each_with_index do |orig,i|
    if r == orig
      # nuke this one
      readfds.slice!(i)
    end
  end
end

That seems like an awful lot of overhead (a Cartesian product?), which
is why I think I’m doing it wrong…

Does anyone have some example code using a select loop? I would like
to use it to multiplex connections to web servers. That is, I have
10000 URLs whose contents I would like to retrieve, and I would
like to do it over 50 parallel connections using a select loop. I know
you can use threads for such an operation, but I would prefer to use
the select loop method.

thanks,
-joe

[Joseph McDonald]:

after I get an answer and I finish
reading from the socket, I want to remove that socket from the readfds
array, but I’m not sure how to do that. Do I need to do something
like:
answer[0].each do |r|
  readfds.each_with_index do |orig,i|
    if r == orig
      # nuke this one
      readfds.slice!(i)
    end
  end
end

Just do

answer[0].each {|r| readfds.delete(r)}

Or even

readfds = readfds - answer[0]
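
The two suggestions differ in one respect that may matter in a select loop: delete changes the existing array, while - builds a new one. A minimal sketch, using symbols as stand-ins for the IO objects (the variable names here are made up for illustration):

```ruby
ready = [:b, :d]                      # stand-in for answer[0]

# Variant 1: Array#delete mutates the array in place, so every
# reference to that array sees the removal.
in_place = [:a, :b, :c, :d]           # stand-in for readfds
alias_of_in_place = in_place
ready.each { |r| in_place.delete(r) }

# Variant 2: Array#- returns a new array and leaves the original alone.
original   = [:a, :b, :c, :d]
difference = original - ready

p in_place      # [:a, :c]
p original      # [:a, :b, :c, :d]
p difference    # [:a, :c]
```

If other parts of the program hold a reference to readfds, the in-place form keeps them in sync; the subtraction form only rebinds the local variable.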

// Niklas

Just do

        answer[0].each {|r| readfds.delete(r)}

Thanks. I do notice that Array#delete seems to be very slow. If I run this:

hits = (200..300).to_a
1.upto(1000) do

  arr = []
  1.upto(500) do |x|
    arr.push(x)
  end

  # hits.each do |h|
  # arr.delete(h)
  # end

end

I get:
     0.47 real 0.46 user 0.00 sys
If I uncomment the delete portion I get:
      8.74 real 8.23 user 0.00 sys
      
My guess is that it is doing the Cartesian product thing. Interesting
that it is 8 times slower than:

  arr.each do |x|
    if x >= 200 and x <= 300
      arr.slice!(x)
    end
  end

which gives: 1.09 real 1.06 user 0.00 sys

Another surprising thing: if I use arr = arr - hits instead, it

gives: 24.36 real 21.91 user 0.03 sys

thanks,
-joe

[Joseph McDonald]:

Just do
answer[0].each {|r| readfds.delete(r)}

Thanks. I do notice that Array#delete seems to be very slow. If I run this:

hits = (200..300).to_a
1.upto(1000) do

  arr = []
  1.upto(500) do |x|
    arr.push(x)
  end

  # hits.each do |h|
  # arr.delete(h)
  # end

end

I get:
0.47 real 0.46 user 0.00 sys
If I uncomment the delete portion I get:
8.74 real 8.23 user 0.00 sys

My guess is that it is doing the Cartesian product thing. Interesting
that it is 8 times slower than:

  arr.each do |x|
    if x >= 200 and x <= 300
      arr.slice!(x)
    end
  end

which gives: 1.09 real 1.06 user 0.00 sys

If you do many deletes, it is slow. It has to search the array for
the element to delete and then move all the data that follows it.

Note that slice! is not the same thing, when you do slice!(x) you remove
the element at position x, not the element with content x. So the
array does not have to be searched.
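
A small sketch of that difference, on made-up data where the values and the positions do not coincide:

```ruby
# Index-based versus content-based removal on the same data.
by_index = [10, 20, 30, 40]
by_value = [10, 20, 30, 40]

removed_at = by_index.slice!(2)    # removes whatever sits at index 2
removed_eq = by_value.delete(20)   # scans for elements == 20 and removes them

p removed_at   # 30
p by_index     # [10, 20, 40]
p removed_eq   # 20
p by_value     # [10, 30, 40]
```

In the benchmark above the values run from 1 to 500 while the indices run from 0 to 499, so arr.slice!(x) removes the element after x rather than x itself.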

If you want to do fast deletes based on content, you can use a hash to store
the data. Store the data in the key of the hash and a dummy value such as
“true” in the value. You get fast lookups and there is no need to move
elements.
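
A minimal sketch of that idea, reusing the numbers from the benchmark (the name pending is only an illustration):

```ruby
# Hash used as a set: the stored objects are the keys, the value is a dummy.
pending = {}
(1..500).each { |x| pending[x] = true }

hits = (200..300).to_a
hits.each { |h| pending.delete(h) }   # hash lookup, no scan and no shifting

remaining = pending.keys              # what is left, as an array
puts remaining.size                   # 399
```

For the select case, the keys would be the socket objects themselves, and pending.keys would be what gets passed to select.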

// Niklas

If you do many deletes, it is slow. It has to search the array for
the element to delete and then move all the data that follows it.

Note that slice! is not the same thing, when you do slice!(x) you remove
the element at position x, not the element with content x. So the
array does not have to be searched.

If you want to do fast deletes based on content, you can use a hash to store
the data. Store the data in the key of the hash and a dummy value such as
"true" in the value. You get fast lookups and there is no need to move
elements.

This is 4 times as fast as arr.delete(obj) on a 500 element array, it
probably gets faster as the array grows:

hits = (200..300).to_a
1.upto(1000) do

  arr = []
  1.upto(500) do |x|
    arr.push(x)
  end

  hash = Hash.new
  arr.each_with_index do |a,i|
    hash[a] = i
  end

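  # Caveat: the indices stored in hash go stale after the first slice!,
  # because every later element shifts down by one.
  hits.each do |hit|
    arr.slice!(hash[hit]) if hash[hit]
  end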
end

2.13 real 2.05 user 0.00 sys

I wonder if Array#- should do the same thing internally? The above
method is 12 times as fast as Array#-

But I digress... I'm still struggling with select()... :) I can't
find any examples except in "server" code.

thanks,
-joe
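
For the client side, one possible shape of such a loop is sketched below. It is not taken from any library. To keep it runnable without a network, open_source is a hypothetical stand-in for "connect and send the request" that returns the read end of a pipe. The names pending, active, max_open and results are likewise assumptions. IO.select is the same call as the bare select used earlier in the thread.

```ruby
# Hypothetical stand-in for "open a connection and send the request":
# returns the read end of a pipe whose write end already holds the reply.
def open_source(name)
  reader, writer = IO.pipe
  writer.write("body of #{name}")
  writer.close
  reader
end

pending  = %w[a b c d e f g]   # stand-ins for the URLs still to fetch
max_open = 3                   # stand-in for the 50 parallel connections
active   = {}                  # IO object => [name, buffer]
results  = {}                  # name => complete body

until pending.empty? && active.empty?
  # Top up the pool of open connections.
  while active.size < max_open && !pending.empty?
    name = pending.shift
    active[open_source(name)] = [name, String.new]
  end

  ready = IO.select(active.keys, nil, nil, 10)
  break if ready.nil?          # timeout: a real client would clean up here

  ready[0].each do |io|
    name, buffer = active[io]
    begin
      buffer << io.read_nonblock(4096)
    rescue IO::WaitReadable
      next                     # nothing to read after all, retry next round
    rescue EOFError
      results[name] = buffer   # peer closed: this transfer is complete
      active.delete(io)        # hash delete, so no array scan is needed
      io.close
    end
  end
end

puts results.size              # 7
```

Keeping the open connections as hash keys sidesteps the array deletion cost discussed above: a finished socket is dropped with one hash delete, and the freed slot is refilled at the top of the next pass.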