What is the reason why Ruby doesn't use native threads...at least on
Windows?
Thanks
--
Posted via http://www.ruby-forum.com/.
ReggW wrote:
> What is the reason why Ruby doesn't use native threads...at least on Windows?
Green threads are the most portable, as threads differ
from one OS to another, I would think. At least, I'm
sure the Windows model isn't the same as pthreads.
Ruby just doesn't use native threads anywhere. If it did,
it would support them first on Linux, its primary
platform. (No flames, please -- I'm just saying that
Matz develops on Linux, and the Windows port is derived
from that.)
It's probably possible to write some kind of extension
to support native threads, but I would think it's quite
a bit of work.
Hal
> ReggW wrote:
>> What is the reason why Ruby doesn't use native threads...at least on
>> Windows?
> Green threads are the most portable, as threads differ
> from one OS to another, I would think. At least, I'm
> sure the Windows model isn't the same as pthreads.
Yuck. And I believe Solaris is even another beast.
> Ruby just doesn't use native threads anywhere. If it did,
> it would support them first on Linux, its primary
> platform. (No flames, please -- I'm just saying that
> Matz develops on Linux, and the Windows port is derived
> from that.)
Fl... Just kidding.
> It's probably possible to write some kind of extension
> to support native threads, but I would think it's quite
> a bit of work.
I don't believe this can be done by an extension alone. Threading is
intertwined with IO operations, uses longjmp etc. IMHO this would
amount to a rewrite of a significant portion of the interpreter. And
that's probably also the reason why it does not happen for Ruby 1.x.
Kind regards
robert
Regarding Solaris: its implementation of threads was what supplied the API
model for Posix threads, so you could say it's as close to the original sin
as anything. Most of the important Unix-like systems support the Posix model
more or less well, but (apart from major defects in some of the
implementations), the key nonportabilities relate to the scheduling
discipline. And the world seems to have arrived at a consensus that the
"typical" scheduling discipline for threads is pre-emptive, so these
differences are no longer that important.
Ironically, the Linux implementation of threads is closest to the one in
Windows, although the APIs couldn't be more different. (Win32 had
kernel-scheduled threads from the earliest beta releases in 1992, at least
three years before the Posix API was standardized.) In both Linux and
Windows, threads are "lightweight processes," relatively heavyweight
entities which are scheduled by the kernel. Ruby's threads (and the threads
in the early Java implementations) are pure userland threads, scheduled by a
library inside your process. (Solaris uses an extremely complex hybrid model
which in my opinion has proven to be far more trouble than it's worth.)
The reason that Ruby's threads are tightly intertwined with the interpreter
logic is because Ruby must prevent the possibility that one of your threads
may make a system call that will block in the kernel (like reading a disk
file or a network socket, accessing the system time, etc) and thus block
every thread in your program. Ruby uses the I/O multiplexer (select) to keep
this from happening.
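To illustrate the multiplexing idea in the paragraph above, here is a minimal sketch using Ruby's IO.select on two pipes. It is not the interpreter's scheduler, only an illustration of the underlying call: IO.select reports which descriptors are readable, so a userland scheduler could resume only those threads whose reads will not block. The pipe names are made up for the example.

```ruby
# Two pipes stand in for two I/O sources that different threads wait on.
idle_reader, _idle_writer = IO.pipe   # nothing is ever written here
busy_reader, busy_writer = IO.pipe

busy_writer.puts "data is waiting"

# Ask which readers can be read without blocking; give up after 1 second.
ready, = IO.select([idle_reader, busy_reader], nil, nil, 1.0)

# Only the pipe that has data is returned, so these reads cannot block.
lines = ready.map(&:gets)
puts lines.inspect
```

A green-thread scheduler built on this call would simply leave the thread waiting on idle_reader parked and run something else.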
Threads can be used for two basic purposes: to make your programs run
faster, or to make them easier to write. Ruby's (and Java's) threads seem
designed primarily to facilitate the latter. You can easily imagine several
kinds of problems that are easier to model if you have access to relatively
independent flows of control. Thus both languages have the "synchronize"
method, taking an arbitrary code block, which makes it easy to lock
relatively large chunks of code in "critical sections" without having to
really design proper synchronization sets.
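As a small illustration of the block-taking "synchronize" style described above, here is a sketch using Mutex#synchronize from Ruby's core library. The counter and thread count are arbitrary choices for the example, and it shows only the coarse "lock a whole block" usage, not a carefully designed locking scheme.

```ruby
counter = 0
lock = Mutex.new

workers = 4.times.map do
  Thread.new do
    1_000.times do
      # The whole read-modify-write runs inside one critical section.
      lock.synchronize { counter += 1 }
    end
  end
end
workers.each(&:join)

puts counter
```

Every increment is serialized by the single lock, which is easy to write but gives the threads no opportunity to run that section concurrently.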
But to effectively use threads for higher performance and concurrency
requires a large amount of experience and understanding, much of which takes
platform dependencies into account. For just one example, I would want to
use a spin lock in some situations, if I'm running on a multi-processor
machine on certain hardware platforms. Ruby doesn't have one.
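For readers unfamiliar with the term, the sketch below shows roughly what the busy-waiting behaviour of a spin lock looks like. SpinLock is a hypothetical class written for this illustration on top of Mutex#try_lock; it is not a real hardware-level spin lock (those rely on atomic CPU instructions) and it is not something Ruby provides.

```ruby
# Hypothetical SpinLock: retries Mutex#try_lock in a loop instead of sleeping.
class SpinLock
  def initialize
    @mutex = Mutex.new
  end

  def synchronize
    # Busy-wait: keep asking for the lock, yielding to other threads
    # between attempts, rather than blocking until it is released.
    Thread.pass until @mutex.try_lock
    begin
      yield
    ensure
      @mutex.unlock
    end
  end
end

total = 0
spin = SpinLock.new
threads = 4.times.map do
  Thread.new { 500.times { spin.synchronize { total += 1 } } }
end
threads.each(&:join)
puts total
```

Spinning only pays off when the lock is held very briefly and another processor can release it in the meantime, which is why it is tied to multiprocessor hardware in the text above.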
It seems to me that Ruby's green-thread implementation is perfectly adequate
for most programmers' requirements. What I think might be interesting is an
extension that would provide access to native threads and synchronization
primitives in parallel with Ruby's (an early version of the EventMachine
library did this). Then you could write extensions that were far more
thread-hot than is possible with Ruby threads. It may be possible to do this
without disturbing the existing implementation. If you wanted to mix Ruby
threads with native threads, you'd just have to be careful to use the native
mutex rather than Ruby's in your Ruby threads.
Francis Cianfrocca wrote:
> It seems to me that Ruby's green-thread implementation is perfectly
> adequate for most programmers' requirements.
But the problem is that it doesn't take advantage of the new
multi-core processors that are now starting to become the standard
machines being sold (at least for my customers).
I'm a newbie to Ruby and I really, really love it, but I think this
will start to become a serious issue for Ruby in the near future.
How do Python, Perl, and PHP handle this (if at all)?
Thanks
Francis Cianfrocca wrote:
...
> Threads can be used for two basic purposes: to make your programs run
> faster, or to make them easier to write. Ruby's (and Java's) threads seem
> designed primarily to facilitate the latter. You can easily imagine several
> kinds of problems that are easier to model if you have access to relatively
> independent flows of control. Thus both languages have the "synchronize"
> method, taking an arbitrary code block, which makes it easy to lock
> relatively large chunks of code in "critical sections" without having to
> really design proper synchronization sets.
You probably mean "Thread.critical" or "Thread.exclusive", and not
"synchronize", at least in the context of ruby. (There is a
Mutex#synchronize and of course that does require you to think about
synchronization sets and ordering.)
> But to effectively use threads for higher performance and concurrency
> requires a large amount of experience and understanding, much of which
> takes platform dependencies into account. For just one example, I would
> want to use a spin lock in some situations, if I'm running on a
> multi-processor machine on certain hardware platforms. Ruby doesn't have one.
Doesn't have one and doesn't need one, as long as threads are green.
But, someday, when ruby has native threads, it will need spin locks.
--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
On 5/27/06, Francis Cianfrocca <garbagecat10@gmail.com> wrote:
> (Solaris uses an extremely complex hybrid model
> which in my opinion has proven to be far more trouble than it's worth.)
Almost true. In Solaris 8, you can link with liblwp to get lightweight
process threads. In Solaris 9 and 10 (*especially* 10), you can just
use pthreads and you'll be getting LWP threads.
-austin
--
Austin Ziegler * halostatue@gmail.com
* Alternate: austin@halostatue.ca
> But the problem is that it doesn't take advantage of these new
> multi-core processors that are now starting to become the standard
> machines being sold (at least for my customers).
THAT is absolutely correct, and insightful. Many people have noticed that
raw processor speeds aren't increasing at nearly the same rate they once
did, and all the chip designers are going to some form of multicore
hardware. The most interesting one (to me at least) is the Cell, which
essentially requires a different programming model if you're going to get
the most out of it.
There's a great deal of controversy over this issue, with a lot of people
contending that C compilers bear most of the responsibility for effective
multicore scheduling. Speaking as an application programmer who has also
written a lot of compilers, I'm partially but not fully convinced by this. I
believe that significant changes in programming methodology will be
*required* in the future to write programs with acceptable performance.
Python is IMO quite a bit more sophisticated than Ruby in handling threads.
I don't rate Perl or PHP as serious contenders for thread-hot development
for several reasons. (Besides, they often run inside of Apache processes,
and Apache will naturally take some advantage of the newer hardware because
of its multiprocess nature.)
I come in for a lot of criticism because I don't care for the way many
programmers are trained to use threads. But I think the shortcomings of the
typical approach to threaded programming that is encouraged by languages
like Ruby, Java and even Python will be far more deleterious on the coming
hardware than they are today. Ironically, Java may have an edge because it
has some deployment systems that can partition programs into
independently schedulable pieces. I'd like to see something similar for Ruby
(and have opened a project ("catamount") to do so) but it's still early.
On Sat, 27 May 2006, ReggW wrote:
> Francis Cianfrocca wrote:
>> It seems to me that Ruby's green-thread implementation is perfectly
>> adequate for most programmers' requirements.
> But the problem is that it doesn't take advantage of these new multi-core
> processors that are now starting to become the standard machines being sold
> (at least for my customers).
it's a small problem. here is some code which starts two processes, three if
you count the parent. both run in separate processes using drb as the ipc
layer to make the communication painless. because the code uses drb the
communication is simple. because it uses multiple processes it allows the
kernel to migrate them to different cpus. the cost is about 100 lines of
pure-ruby (the slave lib). notice how easy it is for parent to communicate
with child and for children to communicate with each other:
  harp:~ > cat a.rb
  require 'slave'
  require 'yaml'
  class ProcessA
    def initialize(b) @b = b end
    def process(n) @b.process(n * n) end
    def pid() Process.pid end
  end
  class ProcessB
    def process(n) n + 6 end
    def pid() Process.pid end
  end
  b = Slave.new(ProcessB.new).object
  a = Slave.new(ProcessA.new(b)).object
  y 'a.pid' => a.pid
  y 'b.pid' => b.pid
  y 'answer' => a.process(6)
  harp:~ > ruby a.rb
  ---
  a.pid: 15142
  ---
  b.pid: 15141
  ---
  answer: 42
this is one of those things that allows one to consider designs that would be
untenable in other languages. obviously using this approach it would be
trivial to set up a job that spawned 16 intercommunicating processes, something
which would be absurd to code in c.
regards.
-a
--
be kind whenever possible... it is always possible.
- h.h. the 14th dali lama
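For readers without the slave library, the same multiprocess idea can be sketched with only fork and the DRb library that ships with Ruby. This is not how slave is implemented, just an illustration under assumptions: the Adder class and the serve_in_child helper are hypothetical names, and it needs a platform that supports fork (so not native Windows).

```ruby
require 'drb/drb'

# Hypothetical worker class, mirroring ProcessB from the post above.
class Adder
  def process(n)
    n + 6
  end

  def pid
    Process.pid
  end
end

# Fork a child that serves +object+ over DRb and return [uri, child_pid].
# The pipe lets the parent wait until the child's server is listening.
def serve_in_child(object)
  reader, writer = IO.pipe
  child = fork do
    reader.close
    DRb.start_service('druby://127.0.0.1:0', object) # port 0: OS picks a port
    writer.puts DRb.uri
    writer.close
    DRb.thread.join                                  # serve until killed
  end
  writer.close
  uri = reader.gets.chomp
  reader.close
  [uri, child]
end

uri, child = serve_in_child(Adder.new)
DRb.start_service('druby://127.0.0.1:0')  # local DRb server for the client side
remote = DRbObject.new_with_uri(uri)

answer = remote.process(36)               # runs in the child process
remote_pid = remote.pid
puts "answer: #{answer}"
puts "runs in another process: #{remote_pid != Process.pid}"

Process.kill('TERM', child)
Process.wait(child)
DRb.stop_service
```

Because the worker lives in its own process, the kernel is free to schedule it on a different CPU from the parent, which is the point being made above.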
> You probably mean "Thread.critical" or "Thread.exclusive", and not
> "synchronize", at least in the context of ruby. (There is a
> Mutex#synchronize and of course that does require you to think about
> synchronization sets and ordering.)
No, I mean Mutex#synchronize and its equivalents in Java and Python. Proper
synchronization design is a fine art, and highly hardware and OS dependent.
The simplicity of #synchronize encourages people not to learn it very
deeply. As I said upthread, the thread-support constructs provided by Ruby,
Python, Java and similar languages seem designed to facilitate the goal of
making threaded programming easier to do. This is of course a fine goal in
itself. But using threads to make programs faster and more concurrent is a
very different goal, one which IMO is NOT well supported by Java or any of
the agile languages.
> Doesn't have one and doesn't need one, as long as threads are green.
> But, someday, when ruby has native threads, it will need spin locks.
Fair enough as far as it goes. But green threads mean you can't take
advantage of multiprocessor hardware at all. (Python has the same
shortcoming, but for a different reason.) So as long as we're clear on
Ruby's goals (grace and ease of cross-platform development) and its
non-goals (performance and scalability), you don't need the more powerful
thread-handling constructs, and for now there's nothing wrong with that. But
all of this changes when serious multicore hardware like the Cell processors
become the norm. At that point, we'll all need to get a lot better at
programming multithreaded, multiprocess or event-driven, and our language
systems will have to evolve accordingly.
Have you investigated or played around with the concurrency model that Bertrand Meyer
has written about for Eiffel? Last time I checked it wasn't implemented but it seemed
like an interesting abstraction.
I do agree with you that it takes a lot of discipline to use threads effectively.
Many times it seems like a standard multi-process model would work just as well as
trying to play with fire in a shared address space. Unix used to be known for its
'cheap' processes and now everyone seems to think that process creation is monumentally
expensive.
The Plan 9 approach to the process/thread dichotomy is pretty interesting also.
Sometimes I think language design in the real world has been held back by the
limitations of the two generally available OS frameworks (Unix and Windows).
Gary Wright
On May 27, 2006, at 7:50 AM, Francis Cianfrocca wrote:
> I come in for a lot of criticism because I don't care for the way many
> programmers are trained to use threads. But I think the shortcomings of the
> typical approach to threaded programming that is encouraged by languages
> like Ruby, Java and even Python will be far more deleterious on the coming
> hardware than they are today. Ironically, Java may have an edge because it
> has some deployment systems that can partition programs into
> independently schedulable pieces. I'd like to see something similar for Ruby
> (and have opened a project ("catamount") to do so) but it's still early.
cremes$ gem list -b |grep slave
slave (0.0.0)
slave
cremes$ gem install slave
Attempting local installation of 'slave'
Local gem file not found: slave*.gem
Attempting remote installation of 'slave'
ERROR: While executing gem ... (Gem::GemNotFoundException)
Could not find slave (> 0) in the repository
Ruh roh!
cr
Chuck Remes
cremes@mac.com
www.familyvideovault.com (not yet live!)
On May 27, 2006, at 10:14 AM, ara.t.howard@noaa.gov wrote:
> On Sat, 27 May 2006, ReggW wrote:
>> Francis Cianfrocca wrote:
>>> It seems to me that Ruby's green-thread implementation is perfectly
>>> adequate for most programmers' requirements.
>> But the problem is that it doesn't take advantage of these new multi-core
>> processors that are now starting to become the standard machines being sold
>> (at least for my customers).
> it's a small problem. here is some code which starts two processes, three if
> you count the parent. both run in separate processes using drb as the ipc
> layer to make the communication painless. because the code uses drb the com is
> simple. because it uses multiple processes it allows the kernel to migrate
> them to different cpus. the cost is about 100 lines of pure-ruby (the slave
> lib). notice how easy it is for parent to communicate with child and for
> children to communicate with each other:
> harp:~ > cat a.rb
> require 'slave'
> require 'yaml'
[snip cool example using 'slave']
You're making a very interesting point, one I've made many times: you're
saying to write cooperative multiprocess rather than multithreaded programs.
If you take aggregate costs into account (including time-to-market and
lifecycle maintenance and support), this approach can be far better than
multithreaded because it's so much more robust and easier to do. Whether
it's as fast, however, is a highly hardware and OS-dependent question. If
you can specify multiprocessor or multicore hardware, multiprocess software
design has a clear edge, IMO. And in a few years nearly all processors for
general computation will be multicore.
(This is a side point (and as we know, the side points always generate the
hottest flames), but I happen to disagree with your choice of DRb. Not
because of the communications model, but because distributed objects are
fundamentally problematic. I'd encourage you to look at multiprocess
event-driven systems. Watch for the upcoming pure-Ruby version of the
EventMachine library on RubyForge; it will have built-in constructs to
explicitly support multiprocess event-driven programming.)
On 5/27/06, Francis Cianfrocca <garbagecat10@gmail.com> wrote:
>> But the problem is that it doesn't take advantage of these new
>> multi-core processors that are now starting to become the standard
>> machines being sold (at least for my customers).
>
> THAT is absolutely correct, and insightful. Many people have noticed that
> raw processor speeds aren't increasing at nearly the same rate they once
> did, and all the chip designers are going to some form of multicore
> hardware. The most interesting one (to me at least) is the Cell, which
> essentially requires a different programming model if you're going to get
> the most out of it.
>
> There's a great deal of controversy over this issue, with a lot of people
> contending that C compilers bear most of the responsibility for effective
> multicore scheduling. Speaking as an application programmer who has also
> written a lot of compilers, I'm partially but not fully convinced by this. I
> believe that significant changes in programming methodology will be
> *required* in the future to write programs with acceptable performance.
>
> Python is IMO quite a bit more sophisticated than Ruby in handling threads.
Is this because Python uses native threads?
> I don't rate Perl or PHP as serious contenders for thread-hot development
> for several reasons. (Besides, they often run inside of Apache processes,
> and Apache will naturally take some advantage of the newer hardware because
> of its multiprocess nature.)
>
> I come in for a lot of criticism because I don't care for the way many
> programmers are trained to use threads. But I think the shortcomings of the
> typical approach to threaded programming that is encouraged by languages
> like Ruby, Java and even Python will be far more deleterious on the coming
> hardware than they are today.
Do you think that threads are just the wrong model or metaphor?
For example, Io has the concept of Actors.
> Ironically, Java may have an edge because it
> has some deployment systems that can partition programs into
> independently schedulable pieces. I'd like to see something similar for Ruby
> (and have opened a project ("catamount") to do so) but it's still early.
Given that fork'ing a new process is pretty cheap (on Linux, at least),
is that perhaps a better way to achieve concurrency for us in the
short term? (Of course there are lots of other issues then, like
sharing data between processes.)
...looking forward to hearing more about catamount.
Phil
Is your Slave code available? (Perhaps someone asked later; I miss
the newsgroup, where I would be able to tell more easily if they did.)
BTW: Just curious: Why are you require'ing YAML? Are you marshalling
in DRb with YAML instead of the builtin marshalling? If so, why? Is
it faster? (I wouldn't think so.)
Phil
I really miss the gateway...
Dammit! I was about to write this library!
(Mine was going to look a little different:
  require 'task'
  x = "Hello"
  Task.new(x) { |o| puts o.upcase } # sets up DRb to have a connection
                                    # to x, forks, connects to x via DRb,
                                    # and yields the DRbObject to the block
)
Francis, you should consider writing a book on advanced programming
concepts. You're a great communicator.
Michael
gwtmp01@mac.com wrote:
...
> I do agree with you that it takes a lot of discipline to use threads
> effectively. Many times it seems like a standard multi-process model
> would work just as well as trying to play with fire in a shared
> address space. Unix used to be known for its 'cheap' processes and
> now everyone seems to think that process creation is monumentally
> expensive.
Agree in general, but in the case of ruby, note that forking a ruby
process is more costly because of GC. In a short-lived child, GC can be
disabled to improve performance. [ruby-talk:186561]
--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
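A minimal sketch of the "disable GC in a short-lived child" suggestion, using only core Ruby. The helper name in_short_lived_child is hypothetical, the result is passed back through a pipe with Marshal as one possible way of sharing data between processes, and whether disabling GC actually helps depends on the Ruby version and workload, so treat it as something to measure rather than a guaranteed win. It needs a platform that supports fork.

```ruby
# Run the block in a forked child with GC turned off and return its result.
# The result travels back to the parent through a pipe via Marshal.
def in_short_lived_child
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode
  pid = fork do
    reader.close
    GC.disable                          # the child exits soon; skip collection
    writer.write(Marshal.dump(yield))   # run the caller's block in the child
    writer.close
    exit!(0)                            # leave without running at_exit handlers
  end
  writer.close
  result = Marshal.load(reader.read)    # read until the child closes its end
  reader.close
  Process.wait(pid)
  result
end

squares = in_short_lived_child { (1..5).map { |n| n * n } }
puts squares.inspect
```

Only values that Marshal can serialize can come back this way, which is one of the data-sharing costs of choosing processes over threads.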
http://codeforpeople.com/lib/ruby/slave/
-Ezra
On 5/27/06, gwtmp01@mac.com <gwtmp01@mac.com> wrote:
> Sometimes I think language design in the real world has been held
> back by the limitations of the two generally available OS frameworks
> (Unix and Windows).
And perhaps also the fact that we've been stuck with the Von Neumann
architecture for so long... or maybe we've been stuck with the Von
Neumann architecture for so long because our languages haven't evolved
in order to effectively model a different architecture?
Phil