Still wondering how you handle blocking IO in fibers.
That wasn't an important feature for the intended purpose of the gem, so
there is no explicit support for it at the moment. That might seem like a
cop-out, but it is exactly what I wanted: minimal features for a specific
use case.
But I get the impression you are dealing with various third-party libs
which might just open a socket and start talking? Couldn't that block the
fiber and therefore the whole thread?
That is the same problem you'd have for any sequential code, whether it is
running in a fiber or in an actor - calling something that blocks
indefinitely - but I think as a user you'd be aware of this. I'm not
proposing a solution to this problem; I think a general one is probably
impossible anyway.
This has always seemed to me to be the compelling feature of ruby's
threads: you just let the thread scheduler manage blocking.
The thread scheduler may seem like a good idea in theory, but in practice
event-driven code that works with OS primitives (select, epoll, kevent) is
generally more efficient. I think there are good arguments either way (e.g.
Sun UltraSPARC chips seemed to be designed for thread-based workloads,
running up to 64 threads in parallel, a bit like Hyper-Threading on x86),
but event-driven systems generally seem easier to reason about, give more
predictable behaviour, better-defined resource usage, etc. Also, as
mentioned, while some Ruby implementations use green threads, not all of
them do. That means that if you use threads, you need to deal with
reentrancy and contention issues, which are at least as complex as dealing
with fibers, if not more so (e.g. calling fork might break everything when
using threads, as mentioned).
Thanks for the example code. I'm sure that can be done more efficiently and
cleanly by having one function calling #select and resuming the correct
fiber.
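To make that concrete for anyone following along, here is a rough sketch of
that idea - not code from the gem, and the MiniReactor name and its methods
are purely illustrative - in which a single loop calls IO.select and resumes
whichever fiber was waiting on each readable IO:

require 'socket'
require 'fiber'

# A tiny illustrative reactor: fibers register the IO they are waiting on,
# yield, and one IO.select call resumes the right fiber when it is readable.
class MiniReactor
  def initialize
    @waiting = {} # io => fiber
  end

  # Called from inside a fiber when a read would block.
  def wait_readable(io)
    @waiting[io] = Fiber.current
    Fiber.yield
  end

  def run
    until @waiting.empty?
      ready, = IO.select(@waiting.keys)
      ready.each do |io|
        fiber = @waiting.delete(io)
        fiber.resume
      end
    end
  end
end

reactor = MiniReactor.new
r1, w1 = IO.pipe
r2, w2 = IO.pipe

[[r1, "one"], [r2, "two"]].each do |io, name|
  Fiber.new do
    begin
      data = io.read_nonblock(10)
    rescue IO::WaitReadable
      reactor.wait_readable(io)
      retry
    end
    puts "Fiber #{name} received #{data.inspect}"
  end.resume
end

w2.write "hello"
w1.write "world"
reactor.run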
Thanks for your ideas and feedback.
Kind regards,
Samuel
On 13 March 2014 13:56, Joel VanderWerf <joelvanderwerf@gmail.com> wrote:
Still wondering how you handle blocking IO in fibers.
If all of the code inside the fiber is under your control, you can use
non-blocking operations, and Fiber.yield if the operation would block. (See
example below.)
But I get the impression you are dealing with various third-party libs
which might just open a socket and start talking? Couldn't that block the
fiber and therefore the whole thread?
This has always seemed to me to be the compelling feature of ruby's
threads: you just let the thread scheduler manage blocking.
For anyone else who's reading and hasn't played with fibers, here's what
you can do to avoid blocking the whole thread while one fiber waits for
input:
----
require 'socket'
require 'fiber'
s1, s2 = UNIXSocket.pair
f = Fiber.new do
  loop do
    begin
      puts "Fiber checking for available data"
      data = s1.read_nonblock(10)
      puts "Fiber received #{data.inspect}"
    rescue IO::WaitReadable
      puts "Fiber yielding"
      Fiber.yield
      puts "Fiber resuming"
      unless IO.select([s1], nil, nil, 0)
        puts "..even though no data is available"
      end
      retry
    rescue => ex
      puts ex
    end
  end
end
f.resume
f.resume
puts "writing to socket"
s2.write "123456"
f.resume
f.resume
puts "writing to socket"
s2.write "abcdef"
f.resume
f.resume
On 03/12/2014 05:03 PM, Samuel Williams wrote:
> Even green threads have this danger, don't they?
Yes, but in this context, I'm actually not sure I'd call the manual
scheduling a danger. While it could be referred to as explicit
scheduling, I prefer to look at it as providing a specific, well-defined,
non-blocking API with explicit synchronisation points.
(I think what I really like about fibers is they make it very easy to
compose concurrent code in a predictable way. For all intents and
purposes, the code is still sequential with very little overhead.)
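As a trivial illustration of that (just a sketch, not from the gem): two
pieces of sequential-looking code, composed by resuming and yielding at
well-defined places.

require 'fiber'

# The producer reads top-to-bottom like sequential code; the only
# scheduling is the explicit Fiber.yield / resume handover.
producer = Fiber.new do
  (1..3).each do |i|
    Fiber.yield "item #{i}"  # explicit synchronisation point
  end
  nil  # signal completion to the consumer loop below
end

while (item = producer.resume)
  puts "consumed #{item}"
end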
Taking over manual scheduling seems a bit awkward compared to using some
kind of concurrency control (mutexes, queues, actors).
I would have said the opposite. Code using threads is typically very
hard to reason about compared to sequential code (like the API I've
proposed).
Except in specific situations (e.g. game engines, data
processing/access, algorithms/compression), I find threading causes more
problems than it solves (e.g.
http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them).
Even debugging code with threads can be a nightmare - why is there a
deadlock - why is there memory corruption - etc. The only situation
where I've seen this working well in general is in
languages/environments designed from the ground up to support parallel
processing (e.g. Haskell, Clojure). Everything else seems like a hack
that requires careful analysis to verify correctness, and the path to
the dark side is always just one (poorly chosen) line of code away.
Anyway, basically, I really like fibers - if you want to run concurrent
unix processes, this gem is a good starting point.
Thanks for your thoughts and input.
Kind regards,
Samuel
On 13 March 2014 09:04, Joel VanderWerf <joelvanderwerf@gmail.com> wrote:
On 03/12/2014 06:22 AM, Samuel Williams wrote:
Threads are good but I felt like I wanted something more predictable.
Also, not all implementations of Ruby use green threads, and therefore
you might have synchronisation issues if you use (either directly or
indirectly through a gem/library) shared global state.
Even green threads have this danger, don't they?
Taking over manual scheduling seems a bit awkward compared to using
some kind of concurrency control (mutexes, queues, actors). What
happens if application code inside the fiber
(process_and_email_results in the example) makes a blocking IO call?
Manual scheduling with fibers is great for testing concurrent code
which would otherwise run in threads, because you can force a
certain kind of contention in a predictable way. I'm working on
extracting a library for doing this from a project where it's been a
useful technique.
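That library isn't shown here, but the general idea might look something
like this rough sketch (all names are illustrative): run each task in a
fiber, yield at the points where a real thread could be preempted, and have
the test drive a specific interleaving.

require 'fiber'

# Illustrative sketch: reproduce a lost-update race deterministically by
# running each "task" in a fiber and yielding at the would-be preemption point.
counter = 0

make_increment = lambda do
  Fiber.new do
    temp = counter      # read shared state
    Fiber.yield         # the point where a thread could have been preempted
    counter = temp + 1  # write back a stale value
  end
end

a = make_increment.call
b = make_increment.call

# Force the interleaving: read a, read b, write a, write b.
a.resume
b.resume
a.resume
b.resume

puts counter  # => 1 rather than 2 - the race is reproduced every run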