Replacing the use of gettimeofday in the scheduler

Using gettimeofday in the scheduler is problematic, since the system time can
jump ahead or back when the user or ntp resets it [1]. This can have side
effects such as sleep() and timeout() never returning, so that threads are
never scheduled again, and it seems to have other side effects as well [3].

Eric Hodel argues [2] that replacing the existing mechanism, which uses
libc-select to sleep and gettimeofday to calculate the time that has actually
elapsed, with libc-sleep is also problematic because:

"[for libc-sleep]... system activity may lengthen the sleep by an indeterminate
amount."

However, this applies in exactly the same way to libc-select as well and thus
replacing the select/gettimeofday mechanism by libc-sleep should at least work
no worse. Objections?

Has there been any effort to implement a solution based on sleep/usleep? Is
there interest in implementing a more robust schedule timing mechanism? Is
there a chance for a patch based on sleep/usleep to make it into CVS?
*t

[1] http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/103140
[2] http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/103245
[3] http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/229829

···

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

My first reaction was: good god, the scheduler uses wallclock time?!
Speaking as someone who works on realtime systems (and thus has to
think about scheduler implementation often), this is never a good
idea. I don't know the background for Ruby's scheduler design, but
normally I'd regard a scheduler which uses wallclock time as just
plain *broken*. I'm heartily in favor of changing it to something
which isn't dependent on the clock.

That "indeterminate amount" referenced above is simply the price you
pay for running in userspace on a modern multitasking OS. Yes,
system activity could delay the return. That's what multitasking
means: you don't get to choose when you get the CPU. In practice, if
applications are experiencing unacceptable latency in OS scheduling
then 1) your gettimeofday()-based implementation is going to be
delayed right along with everything else; and 2) you have bigger
problems, because your system is overloaded.

Cheers,

···

On 3/1/07, Tomas Pospisek <tpo2@sourcepole.ch> wrote:

However, this applies in exactly the same way to libc-select as well and thus
replacing the select/gettimeofday mechanism by libc-sleep should at least work
no worse. Objections?

--
Avdi

Quoting Tomas Pospisek <tpo2@sourcepole.ch>:

Has there been any effort to implement a [scheduling] solution based on
sleep/usleep? Is there interest in implementing a more robust schedule timing
mechanism? Is there a chance for a patch based on sleep/usleep to make it
into CVS?

gnu-libc's sleep(3) manpage suggests that sleep and SIGALRM on non-glibc systems
don't get along. From the POSIX spec [1]:

    "If a SIGALRM signal is generated for the calling process during execution
     of sleep(), except as a result of a prior call to alarm(), and if the
     SIGALRM signal is not being ignored or blocked from delivery, it is
     unspecified whether that signal has any effect other than causing sleep()
     to return."

( Thus it is possible that Ruby's signal handler for SIGALRM will *not* be
  executed )

Since Ruby *does* allow the user to handle SIGALRM, an implementation based on
libc-sleep would fail to work correctly on the systems described above, which
don't handle SIGALRM together with sleep gracefully, whenever the user is doing
something with SIGALRM.

Does anybody know how relevant that is? I.e. does Ruby run at all on such
systems? The above would seem to rule out implementing scheduler waiting with
libc-sleep, since that would prevent Ruby from functioning correctly on such
systems in "corner cases".

?
*t

[1] sleep

···


However, this applies in exactly the same way to libc-select as well and thus
replacing the select/gettimeofday mechanism by libc-sleep should at least work
no worse. Objections?

My first reaction was: good god, the scheduler uses wallclock time?!
Speaking as someone who works on realtime systems (and thus has to
think about scheduler implementation often), this is never a good
idea. I don't know the background for Ruby's scheduler design, but
normally I'd regard a scheduler which uses wallclock time as just
plain *broken*. I'm heartily in favor of changing it to something
which isn't dependent on the clock.

The Ruby thread scheduler uses setitimer(2) and select(2). It depends on the wall-clock for implementing features defined in terms of the wall-clock (Kernel#sleep and Thread#join).

That "indeterminate amount" referenced above is simply the price you
pay for running in userspace on a modern multitasking OS. Yes,
system activity could delay the return. That's what multitasking
means: you don't get to choose when you get the CPU. In practice, if
applications are experiencing unacceptable latency in OS scheduling
then 1) your gettimeofday()-based implementation is going to be
delayed right along with everything else; and 2) you have bigger
problems, because your system is overloaded.

Kernel#sleep behaves differently in Ruby programs using threads. If you sleep in a thread you end up context switching to other threads instead of calling sleep(3).

Since you aren't using sleep(3) in threaded mode, Ruby instead uses gettimeofday(2) to implement Kernel#sleep for the calling thread (has this thread slept its N seconds?), so you may sleep longer than you expect.

The other place gettimeofday(2) is used is Thread#join's timeout, for a similar reason.
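
For readers who have not used these two features, here is a minimal sketch of how they look from Ruby code. The method name join_demo and the durations are made up for illustration; the only behaviour relied upon is that Thread#join with a limit returns nil when the limit expires first.

```ruby
# Illustration only: the durations are arbitrary.
def join_demo
  worker = Thread.new { sleep 0.3; :finished }

  # With a limit, join gives up after about 0.05 s and returns nil,
  # because the worker is still inside Kernel#sleep.
  timed_out = worker.join(0.05)

  # Without a limit, join waits for the worker and returns the thread.
  joined = worker.join

  [timed_out.nil?, joined.equal?(worker), worker.value]
end

p join_demo
```

Both the sleep inside the worker and the limit passed to join are waits that have to be measured against some clock, which is where the time source comes in.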

···

On Mar 1, 2007, at 07:12, Avdi Grimm wrote:

On 3/1/07, Tomas Pospisek <tpo2@sourcepole.ch> wrote:

Do you know of any cross-platform aka POSIX call where the returned time is strictly increasing? How do other green thread implementations handle this problem?
*t

···

On Fri, 2 Mar 2007, Avdi Grimm wrote:

On 3/1/07, Tomas Pospisek <tpo2@sourcepole.ch> wrote:

However, this applies in exactly the same way to libc-select as well and thus replacing the select/gettimeofday mechanism by libc-sleep should at least work no worse. Objections?

My first reaction was: good god, the scheduler uses wallclock time?!
Speaking as someone who works on realtime systems (and thus has to
think about scheduler implementation often), this is never a good
idea. I don't know the background for Ruby's scheduler design, but
normally I'd regard a scheduler which uses wallclock time as just
plain *broken*. I'm heartily in favor of changing it to something
which isn't dependent on the clock.

--
-----------------------------------------------------------
   Tomas Pospisek
   http://sourcepole.com - Linux & Open Source Solutions
-----------------------------------------------------------

<snip>

Thanks for the explanation. I'm probably missing something, I'm
confused by why the functionality you describe in Kernel#sleep and
Thread#join can't be implemented using only select(). Can you clarify?

Thanks,

···

On 3/1/07, Eric Hodel <drbrain@segment7.net> wrote:

The Ruby thread scheduler uses setitimer(2) and select(2). It
depends on the wall-clock for implementing features defined in terms
of the wall-clock (Kernel#sleep and Thread#join).

--
Avdi

However, this applies in exactly the same way to libc-select as well and thus replacing the select/gettimeofday mechanism by libc-sleep should at least work no worse. Objections?

My first reaction was: good god, the scheduler uses wallclock time?!
Speaking as someone who works on realtime systems (and thus has to
think about scheduler implementation often), this is never a good
idea. I don't know the background for Ruby's scheduler design, but
normally I'd regard a scheduler which uses wallclock time as just
plain *broken*. I'm heartily in favor of changing it to something
which isn't dependent on the clock.

The Ruby thread scheduler uses setitimer(2) and select(2). It depends on the wall-clock for implementing features defined in terms of the wall-clock (Kernel#sleep and Thread#join).

You need to add Timeout#timeout to this. But:

$ ri Kernel#sleep

      Suspends the current thread for _duration_ seconds (which may be
      any number, including a +Float+ with fractional seconds). Returns
      the actual number of seconds slept (rounded), which may be less
      than that asked for if another thread calls +Thread#run+. Zero
      arguments causes +sleep+ to sleep forever.

No reference to wall-clock in there. What do you mean by "defined in terms of the wall-clock"?

That "indeterminate amount" referenced above is simply the price you
pay for running in userspace on a modern multitasking OS. Yes,
system activity could delay the return. That's what multitasking
means: you don't get to choose when you get the CPU. In practice, if
applications are experiencing unacceptable latency in OS scheduling
then 1) your gettimeofday()-based implementation is going to be
delayed right along with everything else; and 2) you have bigger
problems, because your system is overloaded.

Kernel#sleep behaves differently in Ruby programs using threads. If you sleep in a thread you end up context switching to other threads instead of calling sleep(3).

Since you aren't using sleep(3) in threaded mode, Ruby instead uses gettimeofday(2) to implement Kernel#sleep for the calling thread (has this thread slept its N seconds?), so you may sleep longer than you expect.

The other place gettimeofday(2) is used is Thread#join's timeout, for a similar reason.

The problem is that when you set system time into the past by a month, then your thread will also sleep for a month and not, as you probably expected, only a few seconds. Which is actually the hint for the solution... to be followed.
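
To make the arithmetic concrete, here is a small model, assuming the wake-up time is stored once as an absolute wall-clock deadline and compared against later clock readings. The method names and the numbers are hypothetical; this is not the code from eval.c.

```ruby
# Hypothetical model of a deadline-based sleep, not Ruby's actual code.
# `now` stands in for what gettimeofday(2) would report, in seconds.
def remaining_sleep(deadline, now)
  [deadline - now, 0].max
end

def clock_jump_demo
  start    = 1_000_000   # arbitrary wall-clock reading when sleep(5) begins
  deadline = start + 5   # absolute wake-up time, computed once

  {
    normal:      remaining_sleep(deadline, start + 2),   # two seconds later
    set_back:    remaining_sleep(deadline, start - 60),  # clock set back 60 s
    set_forward: remaining_sleep(deadline, start + 60)   # clock set ahead 60 s
  }
end

p clock_jump_demo
```

Under that assumption a clock that is set back by N seconds lengthens the wait by N seconds, and a clock that is set ahead ends the wait early.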

*t

···

On Fri, 2 Mar 2007, Eric Hodel wrote:

On Mar 1, 2007, at 07:12, Avdi Grimm wrote:

On 3/1/07, Tomas Pospisek <tpo2@sourcepole.ch> wrote:

--
-----------------------------------------------------------
   Tomas Pospisek
   http://sourcepole.com - Linux & Open Source Solutions
-----------------------------------------------------------

It *is* implemented using select, but select is, per spec, allowed to return before the time's up. So rb_thread_wait_for(time) is (indirectly) using gettimeofday to find out how much time has gone by. And if

(diff = (gettimeofday_now - gettimeofday_before_we_called_select) > 0 )

then rb_thread_wait_for reiterates and sleeps (with select) again.
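
The loop described above can be sketched roughly as follows. This is not the code from eval.c: wait_for, clock and nap are hypothetical names, and the clock and the nap are passed in so that the loop can be exercised with simulated time.

```ruby
# Sketch of a "sleep again until the time is up" loop.
# `clock` returns the current time in seconds; `nap` blocks for at most the
# given number of seconds and may return early, like select(2).
def wait_for(duration, clock:, nap:)
  deadline = clock.call + duration
  naps = 0
  loop do
    remaining = deadline - clock.call
    break if remaining <= 0
    nap.call(remaining)
    naps += 1
  end
  naps
end

# Simulation: every nap wakes up after half of the requested time.
def simulate_wait
  fake_now = 0.0
  clock = -> { fake_now }
  early_nap = ->(seconds) { fake_now += [seconds / 2.0, 0.01].max }
  naps = wait_for(1.0, clock: clock, nap: early_nap)
  { naps: naps, elapsed: fake_now }
end

p simulate_wait
```

With a real clock one could pass something like `nap: ->(s) { IO.select(nil, nil, nil, s) }`. Whatever `clock` returns decides how robust the loop is: with a wall clock it inherits every jump of the system time.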

I can see the following solutions:

    * find a reliable time source that works cross-platform. uptime and
      ticks would be candidates, but I haven't found a way to have them
      cross-platform.
    * use thread_timer as a reliable time source
  *t

···

On Fri, 2 Mar 2007, Avdi Grimm wrote:

On 3/1/07, Eric Hodel <drbrain@segment7.net> wrote:

The Ruby thread scheduler uses setitimer(2) and select(2). It
depends on the wall-clock for implementing features defined in terms
of the wall-clock (Kernel#sleep and Thread#join).

<snip>

Thanks for the explanation. I'm probably missing something, I'm
confused by why the functionality you describe in Kernel#sleep and
Thread#join can't be implemented using only select(). Can you clarify?

  --
-----------------------------------------------------------
   Tomas Pospisek
   http://sourcepole.com - Linux & Open Source Solutions
-----------------------------------------------------------

"wall-clock" refers to real elapsed time, rather than CPU elapsed time. It's better to base your scheduler on CPU elapsed time, since on a heavily loaded system, a "wall-clock"-based scheduler will just thrash without getting much useful work done.

Since there aren't widespread standard APIs for CPU-time-based interrupts, most runtimes with "green thread" schedulers that are based on CPU time approximate it by counting reductions, VM instructions, or AST nodes traversed.
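
A toy sketch of the counting approach, using Ruby fibers as stand-ins for green threads. Everything here (BUDGET, make_task, run_all) is made up for illustration and does not describe how any particular runtime does it.

```ruby
# Toy round-robin scheduler that preempts by counting "reductions"
# (units of work) instead of consulting any clock.
BUDGET = 3  # reductions a task may use before it must yield

def make_task(name, total_steps, log)
  Fiber.new do
    used = 0
    total_steps.times do |step|
      log << "#{name}:#{step}"   # one unit of work
      used += 1
      if used == BUDGET          # budget spent: hand control back
        used = 0
        Fiber.yield :preempted
      end
    end
    :done
  end
end

# Resume each task in turn; a task that has not finished goes to the back.
def run_all(tasks)
  queue = tasks.dup
  until queue.empty?
    task = queue.shift
    queue << task unless task.resume == :done
  end
end

def reduction_demo
  log = []
  run_all([make_task("a", 4, log), make_task("b", 2, log)])
  log
end

p reduction_demo
```

Task "a" is switched out after three units of work no matter how long they took, so the order of execution does not depend on any clock. Timed waits such as sleep still need a time source, though.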

-mental

···

On Fri, 2 Mar 2007 06:30:12 +0900, "Tomas Pospisek's Mailing Lists" <tpo2@sourcepole.ch> wrote:

What do you mean by "defined in terms of the wall-clock"?

The Ruby thread scheduler uses setitimer(2) and select(2). It depends on the wall-clock for implementing features defined in terms of the wall-clock (Kernel#sleep and Thread#join).

You need to add Timeout#timeout to this.

Nope. Timeout calls Kernel#sleep in a thread.

But:

$ ri Kernel#sleep

     Suspends the current thread for _duration_ seconds (which may be
     any number, including a +Float+ with fractional seconds). Returns
     the actual number of seconds slept (rounded), which may be less
     than that asked for if another thread calls +Thread#run+. Zero
     arguments causes +sleep+ to sleep forever.

No reference to wall-clock in there. What do you mean by "defined in terms of the wall-clock"?

When I write "sleep 5" I expect at least five seconds on the clock on my wall to go by before the next statement is executed.

···

On Mar 1, 2007, at 13:30, Tomas Pospisek's Mailing Lists wrote:

On Fri, 2 Mar 2007, Eric Hodel wrote:

I don't see how this could be confused for a bug in Ruby.

···

On Mar 1, 2007, at 13:30, Tomas Pospisek's Mailing Lists wrote:

The problem is that when you set system time into the past by a month, then your thread will also sleep for a month and not, as you probably expected, only a few seconds. Which is actually the hint for the solution... to be followed.

Sounds like a bug to me. Time updates happen, sometimes without user
intervention or knowledge. Software which can glitch or hang up as a
result of this fact isn't robust.

···

On 3/1/07, Eric Hodel <drbrain@segment7.net> wrote:

On Mar 1, 2007, at 13:30, Tomas Pospisek's Mailing Lists wrote:

> The problem is that when you set system time into the past by a
> month, then your thread will also sleep for a month and not, as you
> probably expected, only a few seconds. Which is actually the hint
> for the solution... to be followed.

I don't see how this could be confused for a bug in Ruby.

--
Avdi

The Ruby thread scheduler uses setitimer(2) and select(2). It depends on the wall-clock for implementing features defined in terms of the wall-clock (Kernel#sleep and Thread#join).

You need to add Timeout#timeout to this.

Nope. Timeout calls Kernel#sleep in a thread.

Um. When I do a "timeout(5) { do_something };" and set the system clock back by a minute, then the timeout will not *ever* time out.

But:

$ ri Kernel#sleep

    Suspends the current thread for _duration_ seconds (which may be
    any number, including a +Float+ with fractional seconds). Returns
    the actual number of seconds slept (rounded), which may be less
    than that asked for if another thread calls +Thread#run+. Zero
    arguments causes +sleep+ to sleep forever.

No reference to wall-clock in there. What do you mean by "defined in terms of the wall-clock"?

When I write "sleep 5" I expect at least five seconds on the clock on my wall to go by before the next statement is executed.

That's right. However, this will not happen when you change the system time before the 5 seconds have gone by.
*t

···

On Fri, 2 Mar 2007, Eric Hodel wrote:

On Mar 1, 2007, at 13:30, Tomas Pospisek's Mailing Lists wrote:

On Fri, 2 Mar 2007, Eric Hodel wrote:

--
-----------------------------------------------------------
   Tomas Pospisek
   http://sourcepole.com - Linux & Open Source Solutions
-----------------------------------------------------------

Hi,

···

In message "Re: replacing the use of gettimeofday in the scheduler" on Fri, 2 Mar 2007 07:31:16 +0900, "Avdi Grimm" <avdi@avdi.org> writes:

Sounds like a bug to me. Time updates happen, sometimes without user
intervention or knowledge. Software which can glitch or hang up as a
result of this fact isn't robust.

If it is a bug, I suspect it's a bug in POSIX that doesn't provide any
API for proper "clock" for the purpose. Correct me if I'm wrong.

              matz.

Um. When I do a "timeout(5) { do_something };" and set the system clock back by a minute, then the timeout will not *ever* time out.

Correction, sorry - it will sleep 1 minute + 5 seconds, so...

But:

$ ri Kernel#sleep

    Suspends the current thread for _duration_ seconds (which may be
    any number, including a +Float+ with fractional seconds). Returns
    the actual number of seconds slept (rounded), which may be less
    than that asked for if another thread calls +Thread#run+. Zero
    arguments causes +sleep+ to sleep forever.

No reference to wall-clock in there. What do you mean by "defined in terms of the wall-clock"?

When I write "sleep 5" I expect at least five seconds on the clock on my wall to go by before the next statement is executed.

That's right. However, this will not happen when you change the system time before the 5 seconds have gone by.

... it will not sleep as long as the wall-clock. I.e. will not do what I expect.
*t

···

On Thu, 1 Mar 2007, Tomas Pospisek's Mailing Lists wrote:

--
-----------------------------------------------------------
   Tomas Pospisek
   http://sourcepole.com - Linux & Open Source Solutions
-----------------------------------------------------------

This might well be. Not being a contributor to the Ruby kernel, I
don't know what the policy is: does Ruby only implement features which
can be built with pure POSIX, or can they have OS-specific
implementations?

···

On 3/1/07, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

If it is a bug, I suspect it's a bug in POSIX that doesn't provide any
API for proper "clock" for the purpose. Correct me if I'm wrong.

--
Avdi

I'd say it's a feature that's missing in POSIX. However, you can hack around this by using system-specific calls (on Linux, for example, /proc/uptime).

What about using Ruby's own "thread_timer" as a more or less accurate time source?
*t

···

On Fri, 2 Mar 2007, Yukihiro Matsumoto wrote:

Hi,

In message "Re: replacing the use of gettimeofday in the scheduler" on Fri, 2 Mar 2007 07:31:16 +0900, "Avdi Grimm" <avdi@avdi.org> writes:

>Sounds like a bug to me. Time updates happen, sometimes without user
>intervention or knowledge. Software which can glitch or hang up as a
>result of this fact isn't robust.

If it is a bug, I suspect it's a bug in POSIX that doesn't provide any
API for proper "clock" for the purpose. Correct me if I'm wrong.

--
-----------------------------------------------------------
   Tomas Pospisek
   http://sourcepole.com - Linux & Open Source Solutions
-----------------------------------------------------------

Hi,

···

In message "Re: replacing the use of gettimeofday in the scheduler" on Fri, 2 Mar 2007 08:07:20 +0900, "Avdi Grimm" <avdi@avdi.org> writes:

This might well be. Not being a contributor to the Ruby kernel, I
don't know what the policy is: does Ruby only implement features which
can be built with pure POSIX, or can they have OS-specific
implementations?

It can have OS-specific implementations, but I want the core behavior
to be common on most (if not all) platforms. Besides that, I have no
idea how to fix this "bug" on _any_ platform right now. Any idea?

              matz.

It looks like sufficiently recent POSIX standards DO have a solution
for this. I'd like to do some more research on this, but right now I
don't have time. For anyone who wants to take a look at it, here's a
starting point:
http://www.opengroup.org/onlinepubs/009695399/basedefs/time.h.html

Pay particular attention to CLOCK_MONOTONIC and the timer_*() functions.

Hopefully I'll be able to look at this in greater detail over the weekend.
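
As a quick illustration of what CLOCK_MONOTONIC gives you: later Ruby releases (2.1 and newer, as far as I know) expose it as Process.clock_gettime, so the sketch below assumes such a Ruby and a platform that provides the clock. The helper names monotonic_now and elapsed_seconds are made up.

```ruby
# Measure an interval with the monotonic clock. The reading has no relation
# to the time of day, but it is not affected by someone resetting the date.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

def elapsed_seconds
  started = monotonic_now
  yield
  monotonic_now - started
end

puts format("slept for %.3f s", elapsed_seconds { sleep 0.05 })
```

The absolute value is meaningless on its own; only differences between two readings are useful, which is exactly what a scheduler or a timeout needs.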

···

On 3/1/07, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

It can have OS-specific implementations, but I want the core behavior
to be common on most (if not all) platforms. Besides that, I have no
idea how to fix this "bug" on _any_ platform right now. Any idea?

--
Avdi

Quoting Yukihiro Matsumoto <matz@ruby-lang.org>:

>This might well be. Not being a contributor to the Ruby kernel, I
>don't know what the policy is: does Ruby only implement features which
>can be built with pure POSIX, or can they have OS-specific
>implementations?

It can have OS-specific implementations, but I want the core behavior
to be common on most (if not all) platforms. Besides that, I have no
idea how to fix this "bug" on _any_ platform right now. Any idea?

So here's what I found out after a bit of research and my proposition for a
solution.

Most languages use native threads to implement multithreading so they do not
have to care about scheduling and blocking by themselves. There do not seem to
be many languages/runtimes that use green threads.

* The Gambc Scheme implementation is regarded as being of very high quality
  with respect to its implementation of multithreading/green threads. Although
  it goes to great lengths to be portable, it seems to base its scheduling
  decisions on the "wall-clock"
  (gettimeofday and clock_gettime(CLOCK_REALTIME, ...) ), so I expect Gambc
  to suffer from the very same scheduling problems as Ruby.

* Python uses native threads, but Stackless Python implements green-thread
  scheduling by directly accessing the Pentium's internal clock through the
  RDTSC machine instruction. I have not checked how it implements sleep, i.e.
  whether and how hh:mm:ss.mm is calculated from it. This solution evidently
  has a very high hardcore geek coolness factor but is not very portable.

* The GNU Portable Threads Library uses gettimeofday as well, thus...

So after a day or so of research I am realizing the shocking fact - what Matz
saw too - that the core of the problem is base POSIX not providing any
monotonic clock API, and apparently everybody's scheduler being at the mercy of
some sysadmin issuing a "date -s". If anybody knows any better, then pointers
are welcome.

So I see three approaches for a solution:

1. eliminate the worst case:

   eval.c has a few places where the timeofday() function is used, almost
   exclusively to do something like the following:

     loop() {
         start:     start = timeofday()
                    do_something()
         meanwhile: elapsed_time = timeofday() - start
                    remaining_time = interval_of_interest - elapsed_time
                    if( remaining_time <= 0 )
                        break;
                    else
                        # loop again
      }

   The code between start and meanwhile is in effect a critical section during
   which nobody on the system should be allowed to change the system time,
   which timeofday() cannot guarantee.

   Thus what we *can* do is to at least guarantee that remaining_time
   *never ever* increases:

     if( remaining_time > previous_remaining_time )
        remaining_time = previous_remaining_time;
     # else
        previous_remaining_time = remaining_time;

2. Do it "right":

   Doing it right would require having a monotonic time source, which the
   REALTIME extension of POSIX provides through the
   clock_gettime( CLOCK_MONOTONIC, ... ) function.

   Thus Ruby could schedule correctly on systems that *do* implement the
   POSIX REALTIME extension and use the old "broken" method on the other
   systems or add system specific solutions for those at a later time/as
   needed/submitted.

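   Linux and DragonFly BSD do have CLOCK_MONOTONIC but OSX does not seem
   to have it. If people want to check whether their systems provide it,
   here's a test (a value of 0 means that support has to be queried at run
   time with sysconf()):

     #include <stdio.h>
     #include <unistd.h>

     int main(void) {
     #if defined(_POSIX_MONOTONIC_CLOCK) && _POSIX_MONOTONIC_CLOCK >= 0
         printf("yes\n");
     #else
         printf("no\n");
     #endif
         return 0;
     }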

3. use Ruby's own thread_timer as a source or as a
   time_sanity_offset_correction

   However - I'm not sure whether this approach yields reliable results
   without adding unnecessary complexity.
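
A sketch of approach 1 in Ruby rather than in the C of eval.c, with hypothetical names throughout. One assumption goes beyond what is written above: clamping alone only keeps remaining_time from growing, so the sketch also re-bases the deadline when it detects a backward jump, otherwise the countdown would stall until the wall clock caught up again.

```ruby
# Countdown whose remaining time never increases between two checks.
class ClampedCountdown
  def initialize(interval, now)
    @deadline = now + interval
    @previous_remaining = interval
  end

  # `now` is a wall-clock reading in seconds that may jump around.
  def remaining(now)
    value = @deadline - now
    if value > @previous_remaining
      # The clock apparently went backwards: clamp, then re-base the
      # deadline on the new clock so that the countdown keeps running.
      value = @previous_remaining
      @deadline = now + value
    end
    @previous_remaining = value
  end
end

def clamp_demo
  countdown = ClampedCountdown.new(5, 100)
  # Readings: 2 s pass, the clock is set back by 60 s, then 2 s and 2 s pass.
  [102, 42, 44, 46].map { |now| countdown.remaining(now) }
end

p clamp_demo
```

A forward jump is not corrected here: the countdown simply ends early, which is the less harmful of the two failure modes.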

All solutions however have a semantic side effect: timeofday is being called
from:

   a) the scheduler
   b) from sleep()
   c) indirectly from timeout() through sleep()

Guaranteeing that remaining_time never increases is good for
a) the scheduler and c) timeout(), but can break existing programs using
b) sleep(), in case someone was doing something along the lines of:

     # need to wake up at noon
     sleep_time = noon() - now()
     sleep( sleep_time )

With the current "broken" semantics, that would work just right, since with
the current implementation sleep() time would increase/decrease in parallel
with the "sysadmin" changing the system time with "date -s" or similar.

Thus the question here is: do we want the scheduler and timeout() to work as
naively expected even in a situation where the "wall clock" suddenly changes,
or do we want sleep to work correctly in the same situation? Do we want the
absolute "wall clock" to work right, or do we want the relative "stop watch"
to work right?
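
One way to keep the "wall clock" use case working even if sleep() itself becomes a pure "stop watch" would be a separate helper that wakes at an absolute time. The following is only a sketch: sleep_until, its parameters and the slice length are invented for illustration, and the clock and the nap are injectable so that the loop can be tried with simulated time.

```ruby
# "Alarm clock" sleep: wake at an absolute wall-clock time (seconds since the
# epoch), re-reading the clock in short slices so clock changes are noticed.
def sleep_until(target, slice: 1.0,
                now: -> { Time.now.to_f }, nap: ->(s) { sleep(s) })
  loop do
    remaining = target - now.call
    return if remaining <= 0
    nap.call([remaining, slice].min)
  end
end

# Simulation: the wall clock is set ahead by 3 s during the first nap.
def sleep_until_demo
  t = 0.0
  naps = []
  fake_nap = lambda do |seconds|
    naps << seconds
    t += seconds
    t += 3.0 if naps.size == 1
  end
  sleep_until(10.0, slice: 4.0, now: -> { t }, nap: fake_nap)
  naps
end

p sleep_until_demo
```

The price is a periodic wake-up, and the clock change is only noticed with a delay of up to one slice.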

I'd suggest applying the first two approaches from above, that is:

a) eliminate the worst-case behaviour, where "remaining_time" is growing with
   the current implementation Ruby has and

b) "do the right thing" and use clock_gettime( CLOCK_MONOTONIC, ... ) instead
   of gettimeofday where available.
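
The "where available" part could look roughly like this when expressed in Ruby instead of C. It assumes a Ruby that may or may not provide Process.clock_gettime and CLOCK_MONOTONIC; scheduler_clock is a hypothetical name.

```ruby
# Pick the best available time source once, at startup.
def scheduler_clock
  if Process.respond_to?(:clock_gettime) &&
     Process.const_defined?(:CLOCK_MONOTONIC)
    -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }
  else
    # Old behaviour: wall clock, subject to jumps when the time is reset.
    -> { Time.now.to_f }
  end
end

clock = scheduler_clock
first = clock.call
second = clock.call
puts "elapsed between two readings: #{second - first} s"
```

The check plays the same role as a compile-time test for _POSIX_MONOTONIC_CLOCK in C: callers only see "a clock", and the fallback keeps the current behaviour on systems without a monotonic one.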

Opinions? Shall I try to submit a patch?
*t

[1] clock_getres

···

In message "Re: replacing the use of gettimeofday in the scheduler" on Fri, 2 Mar 2007 08:07:20 +0900, "Avdi Grimm" <avdi@avdi.org> writes:
