Fixing Net::TFTP

Hi all -

I'm extremely new to Ruby programming. Forgive me.

I'm trying to develop an updated Net::TFTP library. The existing
library doesn't seem to work correctly with the few embedded devices
that I've tried it with. It's quite old and uses Timeout::timeout()
while waiting on IO, and this never, ever seems to work (timeout
exception is always thrown, whether I am writing or reading).

I've been rewriting the library to use IO.select() in place of all of
the old Timeout::timeout() statements. I'm finding that its behavior is
really weird, though.

Here is my snippet of modified Net::TFTP.putbinaryfile:

    def putbinary(remotefile, io, &block) # :yields: data, seq
      s = UDPSocket.new
      peer_ip = IPSocket.getaddress(@host)
      puts "putting binary file to ", peer_ip
      peer_tid = nil
      seq = 0
      from = nil
      data = nil
      while TRUE do
        s.send(wrq_packet(remotefile, "octet"), 0, peer_ip, @port)
        puts "(Re-)Sent fwrite request, waiting for response"
        a = IO.select([s], nil, nil, 1)
        if a
          puts "."
          packet, from = s.recvfrom(2048,0)
          puts "."
          puts "received packet " , packet, " from ", from
          next unless peer_ip == from[3]
          type, block, data = scan_packet(packet)
          break if (type == OP_ERROR) || (type == OP_OPACK) ||
                   ((type == OP_ACK) && (block == seq))
        end
      end

My output is:
<I see the write request packet go out via wireshark>
<0.04 seconds later, I see an ACK from the tftp server>
Sent fwrite request, waiting for response
.
<The program then pauses for 5 seconds>
.
received packet....yadda yadda yadda

The same delay occurs when I actually send packets of the file (and if
the file is big, the 5-second delays between blocks make the transfer
horrifyingly slow).

It seems that recvfrom() still blocks, even though IO.select should not
return a non-nil response unless there is data to be read on the socket?
Setting different maximum sizes to recvfrom() doesn't change the
behavior.

I wrote a separate program doing the exact same thing with IO.select()
followed by recvfrom(). It then s.send()'s the data back to the client,
basically making a UDP echo service. Connecting to that with netcat
yields exactly what I'd expect: the data is echo'd back to netcat
immediately, not after a several-second delay.
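
For reference, a minimal sketch of that select-then-recvfrom echo pattern, run entirely over loopback so it is self-contained. The variable names and the use of an OS-assigned port are illustrative assumptions, not the poster's actual test program. One further assumption that is not established anywhere in this thread: the sketch sets BasicSocket.do_not_reverse_lookup = true, because on older Rubies recvfrom may try to resolve the sender's address to a hostname, and a peer with no reverse DNS entry could plausibly stall there for several seconds. It may be worth ruling out.

```ruby
require 'socket'

# Assumption: skip the reverse DNS lookup recvfrom may otherwise perform
# on the sender's address (older Rubies such as 1.8.x do this by default).
BasicSocket.do_not_reverse_lookup = true

server = UDPSocket.new
server.bind('127.0.0.1', 0)          # port 0: let the OS pick a free port
port = server.addr[1]

client = UDPSocket.new
client.send('ping', 0, '127.0.0.1', port)

echoed = nil

# Wait up to 1 second for the server socket to become readable.
if IO.select([server], nil, nil, 1)
  data, from = server.recvfrom(2048)
  # from is [address_family, port, hostname, numeric_ip]
  server.send(data, 0, from[3], from[1])
end

# Wait up to 1 second for the echo to come back.
if IO.select([client], nil, nil, 1)
  echoed, _sender = client.recvfrom(2048)
end

puts "echoed: #{echoed.inspect}"
server.close
client.close
```

If the delay disappears with the lookup disabled and returns with it enabled, the time is going into name resolution rather than into the socket read itself.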

If it's any help, I'm running Ruby 1.8.7...

I'd appreciate any help that folks can provide, even if it's just a
pointer to some other documentation that I should read. From what I've
read, IO.select() should do what I want, though?

Thanks,
Reid

--
Posted via http://www.ruby-forum.com/.

What happens if you set non-blocking on the socket and try to catch an exception on the recv?

    require 'fcntl'
    ...
    s = UDPSocket.new
    s.fcntl(Fcntl::F_SETFL, s.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)
    ...
    begin
      packet, from = s.recvfrom(2048, 0)
    rescue StandardError => e
      puts "Waarg: #{e}"
      next
    end
    ...
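
A self-contained sketch of the same idea, for anyone who wants to try it in isolation. It is an illustration under assumptions rather than the library's code: the socket is bound to loopback with nothing sent to it, and it calls recvfrom_nonblock instead of plain recvfrom, since plain recvfrom may still wait inside Ruby even when O_NONBLOCK is set on the descriptor.

```ruby
require 'socket'
require 'fcntl'

s = UDPSocket.new
s.bind('127.0.0.1', 0)               # port 0: let the OS pick a free port

# Set O_NONBLOCK while preserving the other file status flags.
flags = s.fcntl(Fcntl::F_GETFL)
s.fcntl(Fcntl::F_SETFL, flags | Fcntl::O_NONBLOCK)
nonblocking = (s.fcntl(Fcntl::F_GETFL) & Fcntl::O_NONBLOCK) != 0

# Nothing has been sent to this socket, so a non-blocking read should
# raise straight away instead of waiting.
outcome = begin
  s.recvfrom_nonblock(2048)
  :got_data
rescue Errno::EAGAIN, Errno::EWOULDBLOCK
  :would_block
end

puts "O_NONBLOCK set: #{nonblocking}, read outcome: #{outcome}"
s.close
```

Rescuing the specific Errno classes, rather than StandardError, keeps unrelated failures visible while experimenting.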

Sam

On 13/12/11 17:05, Reid Wightman wrote:

[snip: full quote of the original post]

Sam Duncan wrote in post #1036445:

What happens if you set non-blocking on the socket and try to catch an
exception on the recv?

[code snipped]

Hi Sam -

Thanks for the help.

I am confused, because my program still blocks on the call to
s.recvfrom(). I tossed a puts before and after the 's.recvfrom()' call
and sure enough, it is still pausing for 5 seconds inside of the try
block. This in spite of the O_NONBLOCK. And for sanity I did a few
other printy things to be sure that my program is using the modified
version of the library.

Is there any known goofiness to ruby 1.8.7's IO not working correctly?
The behavior that I'm seeing seems really...odd.

Thanks again,
Reid

And you said that changing the recv buffer size (2**16) made no difference? How about specifying Socket::MSG_TRUNC in the flags to see whether the whole datagram made it (see man recv(2) / recvfrom(2))? Or maybe try reading the header (I read it is four bytes) to find the data length, and then do a subsequent read for the rest based on that? Sorry, I'm clutching at straws, but I'm curious to see what you find. I found some old forum posts with people discussing non-blocking UDP sockets in Ruby, but they gave up and used EventMachine before the issue was resolved, which is a shame.
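
On the header idea, a small sketch of splitting a TFTP datagram into its parts, assuming the RFC 1350 layout: a 2-byte opcode followed by a 2-byte block number (or error code), both big-endian, then the payload. The constant and method names below are hypothetical and are not the library's own OP_* constants or scan_packet. Note that RFC 1350 packets carry no explicit length field, and that a UDP read returns one datagram per call, with any bytes beyond the requested length discarded, so a follow-up read would fetch the next datagram rather than the remainder of this one.

```ruby
# Opcodes from RFC 1350 (constant names are illustrative).
OPCODE_DATA  = 3
OPCODE_ACK   = 4
OPCODE_ERROR = 5

# Split a raw TFTP datagram into [opcode, block_or_error_code, payload].
# Returns nil if the packet is too short to hold the 4-byte header.
def parse_tftp_packet(packet)
  return nil if packet.nil? || packet.length < 4
  opcode, number = packet.unpack('nn')   # two 16-bit big-endian integers
  [opcode, number, packet[4..-1]]
end

ack  = [OPCODE_ACK, 0].pack('nn')
data = [OPCODE_DATA, 1].pack('nn') + 'hello'

p parse_tftp_packet(ack)     # => [4, 0, ""]
p parse_tftp_packet(data)    # => [3, 1, "hello"]
p parse_tftp_packet('xy')    # => nil
```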

Looking at s_recvfrom and s_recvfrom_nonblock in socket.c the main difference appears to be that the non-block version sets MSG_DONTWAIT if defined, and then uses rb_io_set_nonblock from io.c to do the fcntl shuffle. The blocking version has a retry loop which sits in rb_io_wait_readable on the socket fd. I would definitely use the nonblock one when fiddling around just to rule out ever getting into that rb_io_wait_readable which is just another select.
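
A sketch of how the non-blocking call could be combined with an explicit timeout, so the only wait is the IO.select in plain view. The helper name recv_with_timeout and its parameters are hypothetical, and the loopback sockets exist only to exercise it.

```ruby
require 'socket'

# Hypothetical helper: wait up to `timeout` seconds for one datagram.
# Returns [data, sender_info] on success, or nil if the deadline passes.
def recv_with_timeout(sock, timeout, maxlen = 2048)
  deadline = Time.now + timeout
  loop do
    remaining = deadline - Time.now
    return nil if remaining <= 0
    # select returns nil on timeout; loop back and let the deadline check end it.
    next unless IO.select([sock], nil, nil, remaining)
    begin
      return sock.recvfrom_nonblock(maxlen)
    rescue Errno::EAGAIN, Errno::EWOULDBLOCK
      # Reported readable but nothing was there; wait again.
    end
  end
end

receiver = UDPSocket.new
receiver.bind('127.0.0.1', 0)        # port 0: let the OS pick a free port
sender = UDPSocket.new

# Nothing sent yet, so this should give up after about 0.2 seconds.
timed_out = recv_with_timeout(receiver, 0.2)

sender.send('ack', 0, '127.0.0.1', receiver.addr[1])
data, _from = recv_with_timeout(receiver, 1)

puts "timed out first: #{timed_out.nil?}, then got #{data.inspect}"
receiver.close
sender.close
```

In a TFTP client, a nil return would be the cue to resend the last packet.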

For what it is worth, neither of those functions appears to have changed between ruby-1.8.6-p420, ruby-1.8.7-p352, and ruby-1.9.3-p0.

Sam

On 14/12/11 05:20, Reid Wightman wrote:

[snip: full quote of the previous post]