[PATCH] Long-Standing Bug in the Readline extension

The Readline uses the the GNU readline library's event handling hook
to try to ensure that other ruby threads do not block while it is awaiting user keystrokes from STDIN. I assert that the use of the event hook for this purpose is unnecessary and, furthermore, the current implementation can set the Thread.critical flag in other threads.

The current threadling_event() hook function is:

static int
readline_event()
{
   CHECK_INTS;
   rb_thread_schedule(); //bug here if Thread.critical !!!
   return 0;
}

If Thread.critical is set when readline is invoked, readline_event() will still schedule any other thread that becomes ready. This can
cause another thread, that was not in a critical section, to interrupt one that was. So, now we have a thread becoming "critical" asynchronously. This leads to some very weird lockups -- some of which have been reported on this mailing list.

I think a quick fix might be to remove the unconditional rb_thread_schedule() call from readline_event. But, a better
fix is to remove the readline_event() function entirely.

So, how will we avoid blocking other threads during keyboard reads?
Replace readline_event() with this:

static int
readline_getc (ignored)
   FILE *ignored;
{
   VALUE string = rb_funcall (rb_stdin, rb_intern("sysread"),
            1, INT2FIX(1));
   return RSTRING(string)->ptr[0]; //single byte read
}

We redefine the function that GNU readline uses to read the input stream
to be one based on Ruby's IO#sysread and let Ruby's own IO class
deal with all its threading nastyness.

This has been tested (fairly extensively) in our own multi-threading
motion control application under Ruby 1.6.8. The same bug appears
to exist in the 1.8.x series, although I have not tried
this patch against 1.8.x. I'm hoping someone on the list whose
running 1.8.x (everyone?) will test this.

The changes to ext/readline/readline.c are confined to the first few
lines. Everything from the readline_readline() fn on remains unchanged:

/* readline.c -- GNU Readline module
    Copyright (C) 1997-1998 Shugo Maeda */

#include <errno.h>
#include <stdio.h>
#include <readline/readline.h>
#include <readline/history.h>

#include "ruby.h"
#include "rubysig.h"
#include "rubyio.h"

static VALUE mReadline;

#define TOLOWER(c) (isupper(c) ? tolower(c) : c)

#define COMPLETION_PROC "completion_proc"
#define COMPLETION_CASE_FOLD "completion_case_fold"

#ifndef READLINE_42_OR_LATER
# define rl_filename_completion_function filename_completion_function
# define rl_username_completion_function username_completion_function
# define rl_completion_matches completion_matches
#endif

static int
readline_getc (ignored)
   FILE *ignored;
{
   VALUE string = rb_funcall (rb_stdin, rb_intern("sysread"), 1, INT2FIX(1));
   return RSTRING(string)->ptr[0]; //single byte read
}

···

--
  Brent Roman
  mailto:brent@mbari.org http://www.mbari.org/~brent

Hi,

At Mon, 27 Dec 2004 09:30:16 +0900,
Brent Roman wrote in [ruby-talk:124508]:

If Thread.critical is set when readline is invoked, readline_event()
will still schedule any other thread that becomes ready. This can
cause another thread, that was not in a critical section, to interrupt
one that was. So, now we have a thread becoming "critical"
asynchronously. This leads to some very weird lockups -- some of which
have been reported on this mailing list.

It would have to return immediately if rb_thread_critical is
set.

I think a quick fix might be to remove the unconditional
rb_thread_schedule() call from readline_event. But, a better
fix is to remove the readline_event() function entirely.

So, how will we avoid blocking other threads during keyboard reads?
Replace readline_event() with this:

static int
readline_getc (ignored)
   FILE *ignored;
{
   VALUE string = rb_funcall (rb_stdin, rb_intern("sysread"),
            1, INT2FIX(1));
   return RSTRING(string)->ptr[0]; //single byte read
}

We redefine the function that GNU readline uses to read the input stream
to be one based on Ruby's IO#sysread and let Ruby's own IO class
deal with all its threading nastyness.

Do you mean to set it to rl_getc_function?

···

--
Nobu Nakada

Hi,

... So, now we have a thread becoming "critical" asynchronously. This leads to some very weird lockups -- some of which have been reported on this mailing list.

It would have to return immediately if rb_thread_critical is
set.

I'm not sure what "it" refers to here.

If rb_thread_critical is set at the readline_getc() function
no other threads can be scheduled during the IO read.
This is usually a "bad thing", but I have used it to stop
scheduling for debugging, as in:

Thread.new {$a=0; loop{sleep .1; $a+=1}}; sleep 1; Thread.critical=true; breakpoint

The "new" thread will pause while processing the breakpoint in irb.
It works as expected with the patches I just submitted
ruby-talk "Fixes for rb_thread_critical bugs"

So, I think that rb_thread_critical should be not be affected by
readline. It's up to the caller to maintain it.

...
We redefine the function that GNU readline uses to read the input stream
to be one based on Ruby's IO#sysread and let Ruby's own IO class
deal with all its threading nastyness.

Do you mean to set it to rl_getc_function?

Yes. Good catch.

There is one line inserted near the end of Init_readline():

...
     rl_attempted_completion_function
  = (CPPFunction *) readline_attempted_completion_function;
     rl_getc_function = readline_getc; //this is the rl_getc_function
     rl_clear_signals();
}

My original patch missed this. Sorry.

···

--
  Brent Roman

Brent Roman wrote:

> under Ruby 1.6.8
The "new" thread will pause while processing the breakpoint in irb.
It works as expected with the patches I just submitted
ruby-talk "Fixes for rb_thread_critical bugs"

Sorry to be off-topic, but this got me interested. Is breakpoint.rb working correctly on 1.6.8?

Thank you.

I don't know if it's working "correctly", but it is functional.
I liked the breakpoint idea so much that I distilled it
into the following for 1.6.8:

   module IRB
   #normal usage is:
   # breakpoint binding
   #drops into another IRB session with the caller's binding
   #exit (optional return value) to resume the broken thread
     def IRB.breakpoint(workspace = nil, name="breakpoint")
       puts "At #{name}:"
       workspace = IRB::WorkSpace.new workspace if
    workspace.is_a? Binding
       @CONF[:MAIN_CONTEXT] = (irb = IRB::Irb.new(workspace)).context

       oldTrap = trap 'INT' do
         irb.signal_handle
       end
       result = catch :IRB_EXIT do
         begin
           irb.eval_input
         rescue Exception
           print $!.type, ": ", $!, "\n"
           retry
         end
       end
       trap ('INT', nil, &oldTrap)
       result
     end
   end
   def breakpoint (*args)
     IRB.breakpoint(*args)
   end

The main limitation of this hack verses the real one is that the caller
must explicitly pass the binding into the breakpoint. There's no
"bindingOfCaller". If you'd like to take a crack at that, I'd be most
grateful.

···

--
  Brent Roman
  mailto:brent@mbari.org http://www.mbari.org/~brent

Brent Roman wrote:

I don't know if it's working "correctly", but it is functional.

Ah, that's good to know. :slight_smile:

I liked the breakpoint idea so much that I distilled it
into the following for 1.6.8:
[snip]
The main limitation of this hack verses the real one is that the caller
must explicitly pass the binding into the breakpoint. There's no
"bindingOfCaller". If you'd like to take a crack at that, I'd be most
grateful.

Not sure what you want there -- Binding.of_caller is already available as a library and should be easy to drop in. Change breakpoint() to this and it ought to work:

def breakpoint(name = "breakpoint", context = nil)
   return IRB.breakpoint(context, name) if context
   Binding.of_caller do |context|
     IRB.breakpoint(context, name)
   end
end

But maybe I am misunderstanding.

Thanks Florian:

I extracted continuation.rb and binding.rb from the (ponderous?)
extensions framework. Worked on the first pull!

I also swapped the order of the arguments to IRB.breakpoint
now that it's much less likely that anyone will specify a
workspace.

Thanks again,

···

--
  Brent Roman
  mailto:brent@mbari.org http://www.mbari.org/~brent