Net-ssh - execution expired

hi *,

i'm having a problem with net-ssh. i'm trying to run a command on
another server that might take about 10h. after about 4 to 5 hours i ran
into the error below.
is there a way to modify the timeout that might cause the problem?

i know this is the ruby mailing list and i'm working on a rails app but
this problem is more related to pure ruby than rails.

it would be great if somebody could give me some pointers.

thx for your help,
simon

#<Class:0x000000043ba6b0>: execution expired
/usr/local/lib/ruby/gems/1.9.1/gems/net-ssh-2.0.23/lib/net/ssh/ruby_compat.rb:22:in
`select'
/usr/local/lib/ruby/gems/1.9.1/gems/net-ssh-2.0.23/lib/net/ssh/ruby_compat.rb:22:in
`io_select'
/usr/local/lib/ruby/gems/1.9.1/gems/net-ssh-2.0.23/lib/net/ssh/connection/session.rb:201:in
`process'
/usr/local/lib/ruby/gems/1.9.1/gems/net-ssh-2.0.23/lib/net/ssh/connection/session.rb:161:in
`block in loop'
/usr/local/lib/ruby/gems/1.9.1/gems/net-ssh-2.0.23/lib/net/ssh/connection/session.rb:161:in
`loop'
/usr/local/lib/ruby/gems/1.9.1/gems/net-ssh-2.0.23/lib/net/ssh/connection/session.rb:161:in
`loop'
/usr/local/lib/ruby/gems/1.9.1/gems/net-ssh-2.0.23/lib/net/ssh/connection/channel.rb:269:in
`wait'
/var/rails/releases/20110314075616/lib/ssh_commander.rb:119:in `block (2
levels) in runCmd'
/var/rails/releases/20110314075616/lib/ssh_commander.rb:96:in `each'
/var/rails/releases/20110314075616/lib/ssh_commander.rb:96:in `block in
runCmd'
/usr/local/lib/ruby/gems/1.9.1/gems/net-ssh-2.0.23/lib/net/ssh.rb:186:in
`start'
/var/rails/releases/20110314075616/lib/ssh_commander.rb:88:in `runCmd'
/var/rails/releases/20110314075616/app/models/task.rb:292:in
`command_run'
/var/rails/releases/20110314075616/app/models/task_mysql_slave.rb:65:in
`block (2 levels) in <class:TaskMysqlSlave>'
/var/rails/releases/20110314075616/app/models/task.rb:341:in `call'
/var/rails/releases/20110314075616/app/models/task.rb:341:in
`command_log_block'
/var/rails/releases/20110314075616/app/models/task_mysql_slave.rb:59:in
`block in <class:TaskMysqlSlave>'
/var/rails/releases/20110314075616/app/models/task.rb:154:in
`instance_exec'
/var/rails/releases/20110314075616/app/models/task.rb:154:in `exec'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/performable_method.rb:20:in
`perform'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/backend/base.rb:83:in
`invoke_job'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/worker.rb:119:in
`block (2 levels) in run'
/usr/local/lib/ruby/1.9.1/timeout.rb:57:in `timeout'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/worker.rb:119:in
`block in run'
/usr/local/lib/ruby/1.9.1/benchmark.rb:309:in `realtime'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/worker.rb:118:in
`run'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/worker.rb:176:in
`reserve_and_run_one_job'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/worker.rb:103:in
`block in work_off'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/worker.rb:102:in
`times'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/worker.rb:102:in
`work_off'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/worker.rb:77:in
`block (2 levels) in start'
/usr/local/lib/ruby/1.9.1/benchmark.rb:309:in `realtime'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/worker.rb:76:in
`block in start'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/worker.rb:73:in
`loop'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/worker.rb:73:in
`start'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/command.rb:100:in
`run'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/command.rb:79:in
`block in run_process'
/usr/local/lib/ruby/gems/1.9.1/gems/daemons-1.0.10/lib/daemons/application.rb:215:in
`call'
/usr/local/lib/ruby/gems/1.9.1/gems/daemons-1.0.10/lib/daemons/application.rb:215:in
`block in start_proc'
/usr/local/lib/ruby/gems/1.9.1/gems/daemons-1.0.10/lib/daemons/daemonize.rb:192:in
`call'
/usr/local/lib/ruby/gems/1.9.1/gems/daemons-1.0.10/lib/daemons/daemonize.rb:192:in
`call_as_daemon'
/usr/local/lib/ruby/gems/1.9.1/gems/daemons-1.0.10/lib/daemons/application.rb:219:in
`start_proc'
/usr/local/lib/ruby/gems/1.9.1/gems/daemons-1.0.10/lib/daemons/application.rb:255:in
`start'
/usr/local/lib/ruby/gems/1.9.1/gems/daemons-1.0.10/lib/daemons/controller.rb:69:in
`run'
/usr/local/lib/ruby/gems/1.9.1/gems/daemons-1.0.10/lib/daemons.rb:188:in
`block in run_proc'
/usr/local/lib/ruby/gems/1.9.1/gems/daemons-1.0.10/lib/daemons/cmdline.rb:105:in
`call'
/usr/local/lib/ruby/gems/1.9.1/gems/daemons-1.0.10/lib/daemons/cmdline.rb:105:in
`catch_exceptions'
/usr/local/lib/ruby/gems/1.9.1/gems/daemons-1.0.10/lib/daemons.rb:187:in
`run_proc'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/command.rb:78:in
`run_process'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/command.rb:72:in
`block in daemonize'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/command.rb:70:in
`times'
/usr/local/lib/ruby/gems/1.9.1/gems/delayed_job-2.1.2/lib/delayed/command.rb:70:in
`daemonize'
/var/rails/current/script//delayed_job:5:in `<main>'

···

--
Posted via http://www.ruby-forum.com/.

Hey,

it seems I am getting the same kind of issue right now that seems to
happen randomly where one of my ssh.exec! call get stuck.

I ran with debug verbose and collected the following logs:

I, [2011-07-21T21:08:41.387471 #17018] INFO --
net.ssh.connection.session[54acbcc]: channel_open_confirmation: 5 1 0
32768
I, [2011-07-21T21:08:41.387596 #17018] INFO --
net.ssh.connection.channel[53efa28]: sending channel request "exec"
D, [2011-07-21T21:08:41.387805 #17018] DEBUG -- tcpsocket[5505212]:
queueing packet nr 22 type 98 len 44
D, [2011-07-21T21:08:41.387947 #17018] DEBUG -- tcpsocket[5505212]: sent
68 bytes
D, [2011-07-21T21:08:41.391034 #17018] DEBUG -- tcpsocket[5505212]: read
88 bytes
D, [2011-07-21T21:08:41.391278 #17018] DEBUG -- tcpsocket[5505212]:
received packet nr 39 type 93 len 28
I, [2011-07-21T21:08:41.391397 #17018] INFO --
net.ssh.connection.session[54acbcc]: channel_window_adjust: 5 +2097152
D, [2011-07-21T21:08:41.391558 #17018] DEBUG -- tcpsocket[5505212]:
received packet nr 40 type 99 len 12
I, [2011-07-21T21:08:41.391650 #17018] INFO --
net.ssh.connection.session[54acbcc]: channel_success: 5
D, [2011-07-21T21:08:41.418998 #17018] DEBUG -- tcpsocket[5505212]: read
36 bytes
D, [2011-07-21T21:08:41.419205 #17018] DEBUG -- tcpsocket[5505212]:
received packet nr 41 type 96 len 12
I, [2011-07-21T21:08:41.419302 #17018] INFO --
net.ssh.connection.session[54acbcc]: channel_eof: 4
D, [2011-07-21T21:08:41.419501 #17018] DEBUG -- tcpsocket[5505212]: read
104 bytes
D, [2011-07-21T21:08:41.419699 #17018] DEBUG -- tcpsocket[5505212]:
received packet nr 42 type 98 len 44
I, [2011-07-21T21:08:41.419825 #17018] INFO --
net.ssh.connection.session[54acbcc]: channel_request: 4 exit-signal
false
D, [2011-07-21T21:08:41.420041 #17018] DEBUG -- tcpsocket[5505212]:
received packet nr 43 type 97 len 12
I, [2011-07-21T21:08:41.420133 #17018] INFO --
net.ssh.connection.session[54acbcc]: channel_close: 4
D, [2011-07-21T21:08:41.420344 #17018] DEBUG -- tcpsocket[5505212]:
queueing packet nr 23 type 97 len 28
D, [2011-07-21T21:08:41.420464 #17018] DEBUG -- tcpsocket[5505212]: read
140 bytes
D, [2011-07-21T21:08:41.420620 #17018] DEBUG -- tcpsocket[5505212]:
received packet nr 44 type 96 len 12
I, [2011-07-21T21:08:41.420711 #17018] INFO --
net.ssh.connection.session[54acbcc]: channel_eof: 5
D, [2011-07-21T21:08:41.420886 #17018] DEBUG -- tcpsocket[5505212]:
received packet nr 45 type 98 len 44
I, [2011-07-21T21:08:41.421000 #17018] INFO --
net.ssh.connection.session[54acbcc]: channel_request: 5 exit-status
false
D, [2011-07-21T21:08:41.421197 #17018] DEBUG -- tcpsocket[5505212]:
received packet nr 46 type 97 len 12
I, [2011-07-21T21:08:41.421289 #17018] INFO --
net.ssh.connection.session[54acbcc]: channel_close: 5
D, [2011-07-21T21:08:41.421444 #17018] DEBUG -- tcpsocket[5505212]:
queueing packet nr 24 type 97 len 28

I don't see anything obviously wrong. The backtrace when I kill the app
is the following:

~/.gem/ruby/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/ruby_compat.rb:22:in
`select': Interrupt
        from
~/.gem/ruby/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/ruby_compat.rb:22:in
`io_select'
        from
~/.gem/ruby/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:201:in
`process'
        from
~/.gem/ruby/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:161:in
`block in loop'
        from
~/.gem/ruby/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:161:in
`loop'
        from
~/.gem/ruby/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:161:in
`loop'
        from
~/.gem/ruby/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/channel.rb:269:in
`wait'
        from
~/.gem/ruby/1.9.1/gems/net-ssh-2.1.4/lib/net/ssh/connection/session.rb:355:in
`exec!'

The code is the following:

  mcast.each do |gaddr|
    STDERR.print " [+] testing filtering for multicast address
(#{gaddr})"

    thleafiperf = Thread.new do
      ch = @sshleafvm.open_channel do |ch|
        ch.exec("iperf -s -u -B #{gaddr}") do |ch, success|
          raise "failed to launch iperf on multicast leaf vm" unless
success
        end
      end
      ch.wait
    end

    thleaftcpdump = Thread.new do
      ch = @sshleafvm.open_channel do |ch|
        ch[:result]

        ch.exec("tcpdump -i eth0 src host #{@vmrouterip} and dst host
#{gaddr} -c 1 -n -N") do |ch, success|
          raise "failed to launch tcpdump on multicast leaf vm" unless
success
        end

        ch.on_data do |ch, data|
          ch[:result] << data
        end
      end
      ch.wait
      ch[:result]
    end

    throuter = Thread.new do
      ch = @sshroutervm.open_channel do |ch|
        ch.exec("iperf -c #{gaddr} -u -T 32 -t 20 -i 1") do |ch,
success>
          raise "failed to launch iperf on multicast router vm" unless
success
        end
      end
      ch.wait
    end

    (1..60).each do
      sleep 1
      STDERR.print "."
    end
    puts ""

    puts " [+] multicast leaf vm captured: #{thleaftcpdump.value}"
unless thleaftcpdump.alive?
    puts " [+] multicast leaf vm did not capture anything" unless
!thleaftcpdump.alive?
    passed = (acl == "white-list") ? !thleaftcpdump.alive? :
thleaftcpdump.alive?

    if passed
      puts " [+] multicast address (#{gaddr}) properly filtered"
    else
      puts " [-] multicast address (#{gaddr}) filtering failed"
    end

    @sshleafvm.exec!("killall -9 iperf")
    @sshleafvm.exec!("killall -9 tcpdump")
    @sshroutervm.exec!("killall -9 iperf")
  end

It seems to get stuck around the killall commands after couple of
iterations. I did not let it run long enough but I am sure on the long
run I would get the execution expired issue mentioned above.

Thanks for the help,

···

--
Posted via http://www.ruby-forum.com/.