# libev_scheduler

`libev_scheduler` is a libev-based fiber scheduler for Ruby 3.0, based on code extracted from Polyphony.
## Installing

```bash
$ gem install libev_scheduler
```
## Usage

```ruby
Fiber.set_scheduler Libev::Scheduler.new

Fiber.schedule do
  do_something_awesome
end
```
Also have a look at the included tests and examples.
## The scheduler implementation
The present gem uses libev to provide a performant, cross-platform fiber scheduler implementation for Ruby 3.0. The bundled libev is version 4.33, which includes an (experimental) io_uring backend (more on io_uring below).
## Some thoughts on the Ruby fiber scheduler interface
The fiber scheduler interface is a new feature in Ruby 3.0, aimed at facilitating building fiber-based concurrent applications in Ruby. The current specification includes methods for:
- starting a non-blocking fiber
- waiting for an `IO` instance to become ready for reading or writing
- sleeping for a certain time duration
- waiting for a process to terminate
- otherwise pausing/resuming fibers (blocking/unblocking) for use with mutexes, condition variables, queues etc.
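
To make the shape of the interface concrete, here's a rough skeleton of a scheduler class implementing those hooks (bodies elided; the signatures reflect my reading of the Ruby 3.0 interface, so treat them as illustrative rather than authoritative):

```ruby
# Rough skeleton of a fiber scheduler - bodies elided, signatures illustrative.
class SomeScheduler
  # waiting for an IO instance to become ready for reading or writing
  def io_wait(io, events, timeout); end

  # sleeping for a certain time duration
  def kernel_sleep(duration = nil); end

  # waiting for a process to terminate
  def process_wait(pid, flags); end

  # pausing/resuming fibers, for use with mutexes, condition variables, queues etc.
  def block(blocker, timeout = nil); end
  def unblock(blocker, fiber); end

  # starting a non-blocking fiber (invoked by Fiber.schedule)
  def fiber(&block)
    Fiber.new(blocking: false, &block).tap(&:resume)
  end

  # invoked when the thread exits, to drain any pending work
  def close; end
end
```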
However, the current design has some shortcomings that will need to be addressed in order for this feature to become useful. Here are some of my thoughts on this subject. Please do not take this as an attack on the wonderful work of the Ruby core developers. Most probably I'm just some random guy being wrong on the internet :-p.
## Two kinds of fibers
One of the changes made as part of the work on the fiber scheduler interface in Ruby 3.0 was to distinguish between two kinds of fibers: a normal, blocking fiber; and a non-blocking fiber, which can be used in conjunction with the fiber scheduler. While this was probably done for the sake of backward compatibility, I believe this is an error. It introduces ambiguity where previously there was none and makes the API more complex than it could have been.
It seems to me that a more logical solution to the problem of maintaining the blocking behaviour by default would have been to set the non-blocking mode at the level of the thread, instead of the fiber. That also would have allowed using the main fiber (of a given thread) in a non-blocking manner (see below).
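
A hedged sketch of what that alternative might have looked like (the thread-level flag below is hypothetical and does not exist in Ruby 3.0, hence the comments):

```ruby
Fiber.set_scheduler Libev::Scheduler.new  # already per-thread in Ruby 3.0

# Thread.current.non_blocking = true      # hypothetical flag: every fiber on
#                                         # this thread, the main fiber included,
#                                         # would defer blocking ops to the scheduler

# sleep 1                                 # would then go through the scheduler
#                                         # even on the main fiber
```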
## Performing blocking operations on the main fiber
While I have only scratched the surface in terms of the limits of the fiber scheduler interface, it looks pretty clear that the main fiber (in any thread) cannot be used in a non-blocking manner. While fiber scheduler implementations can in principle use `Fiber#transfer` to switch between fibers, which would allow pausing and resuming the main fiber, the current design does not really seem conducive to that.
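
For reference, here is a minimal illustration of pausing and resuming the main fiber with `Fiber#transfer`, independent of the scheduler interface (on Ruby 3.0 the `fiber` stdlib provides `Fiber#transfer` and `Fiber.current`):

```ruby
require 'fiber' # Fiber#transfer / Fiber.current on Ruby 3.0

main = Fiber.current

worker = Fiber.new do
  puts 'worker: running'
  main.transfer # pause the worker, resume the main fiber
end

worker.transfer # pause the main fiber, run the worker
puts 'main: resumed'
```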
## I/O readiness
In and of itself, checking for I/O readiness is nice, but it does not allow us to leverage the full power of io_uring on Linux or IOCP on Windows. In order to leverage the advantages offered by io_uring, for instance, a fiber scheduler should be able to do much more than just check for I/O readiness. It should, rather, be able to perform I/O operations, including read/write, send/recv, connect and accept.
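
To make the distinction concrete, here is a hedged sketch contrasting the readiness-based hook with a hypothetical operation-based hook (the `io_read` hook below is just an illustration, not part of the Ruby 3.0 interface):

```ruby
# Readiness-based: the scheduler only waits; Ruby then performs the read itself.
def io_wait(io, events, timeout)
  # register io with the event loop, yield the current fiber,
  # resume it once io becomes readable/writable
end

# Hypothetical operation-based hook: the backend performs the read, e.g. by
# submitting an io_uring read operation and resuming the fiber on completion.
def io_read(io, buffer, length)
  # submit the read, yield the current fiber,
  # return the number of bytes read once the operation completes
end
```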
This is of course no small undertaking, but the current Ruby native I/O code, currently at almost 14 KLOC, is IMHO ripe for some overhauling, and maybe some separation of concerns.
It seems to me that the API layer for the `IO` class could be separated from the code that does the actual reading/writing etc. This is indeed the approach I took with Polyphony, which provides the same `IO` API for developers, but performs the I/O ops using a libev- or io_uring-based backend. This design can then reap all of the benefits of using io_uring. Such an approach could also allow us to implement I/O using IOCP on Windows (currently we can't, because this requires files to be opened with `WSA_FLAG_OVERLAPPED`).
This is also the reason I have decided not to release a native io_uring-backed fiber scheduler implementation (with code extracted from Polyphony), since I don't believe it can provide any real benefit in terms of performance. If I/O readiness is all that the fiber scheduler can do, it's probably best to just use a cross-platform implementation such as libev, which can then use io_uring behind the scenes.
## Waiting for processes
The problem with the current design is that the `#process_wait` method is expected to return an instance of `Process::Status`. Unfortunately, this class cannot be instantiated, which leads to a workaround using a separate thread.
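
For reference, here is roughly what that workaround looks like (assuming `Process::Status.wait`, which I believe was added alongside the scheduler interface): the blocking wait is pushed onto a separate thread so the scheduler's own thread is not blocked.

```ruby
def process_wait(pid, flags)
  # run the blocking wait on a separate thread and return its Process::Status
  Thread.new { Process::Status.wait(pid, flags) }.value
end
```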
Another difficulty associated with this is that on libev, for example, a child watcher can only be used on the default loop, which means only in the main thread, as the child watcher implementation is based on receiving `SIGCHLD`.
An alternative solution would be to use `pidfd_open` and watch the returned fd for readiness, but I don't know if this can be used on OSes other than Linux.
While a cross-OS solution to the latter problem is potentially not too difficult, the former problem is a real show-stopper. One solution might be to change the API such that `#process_wait` returns an array containing the pid and its status, for example. This can then be used to instantiate a `Process::Status` object somewhere inside `Process.wait`.
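
A hedged sketch of that proposed change (names hypothetical):

```ruby
# Hypothetical: the hook returns raw data instead of a Process::Status instance.
def process_wait(pid, flags)
  # ... wait for the child in a fiber-aware way, obtaining a raw status ...
  [pid, raw_status] # Process.wait would then wrap this in a Process::Status
end
```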
## On having multiple alternative fiber scheduler implementations
It is unclear to me that there is really a need for multiple fiber scheduler implementations. It seems to me that an approach using multiple backends selected according to the OS is much more appropriate. It's not like there are going to be a dozen different implementations of fiber schedulers. Actually, libev fits really nicely here, since it already includes all those different backends.
Besides, the term "fiber scheduler" is a bit of a misnomer, since it doesn't really deal with scheduling fibers, but rather with performing blocking operations in a fiber-aware manner. The scheduling part is in many ways trivial (i.e. the scheduler holds an array of fibers ready to run), but the performing of blocking operations is much more involved.
There is of course quite a bit of interaction between the scheduling part and the blocking operations part, but then again, to me a more sensible design would have been to do everything related to scheduling inside of the Ruby core code, and then offload everything else to a `BlockingOperationsBackend` implementation. Here's what it might look like:
```ruby
# example pseudo-code
class BlockingOperationsBackend
  def poll(opts = {})
    # run the event loop until pending events have been handled
    ev_run(@ev_loop)
  end

  def io_wait(io, opts)
    fiber = Fiber.current
    watcher = setup_watcher_for_io(io) do
      # once io is ready, schedule the waiting fiber
      Thread.current.schedule_fiber(fiber)
    end
    Fiber.yield
    watcher.stop
  end

  ...
end
```
The fiber scheduling part would provide a `Thread#schedule_fiber` method that adds the given fiber to the thread's run queue, and the thread will know when to call the backend's `#poll` method in order to poll for blocking operation completions. For example:
```ruby
# somewhere in Ruby's kischkas:
class Thread
  def schedule_fiber(fiber)
    @run_queue << fiber
  end

  def run_fiber_scheduler
    @backend.poll
    # resume the fibers that became runnable, clearing the queue first
    queue, @run_queue = @run_queue, []
    queue.each { |f| f.resume }
  end
end
```
It seems to me this kind of design would be much easier to implement, and would lead to a lot less code duplication. This design could also be extended later to perform all kinds of blocking operations, such as reading/writing etc., as discussed above.
Finally, such a design could also provide a C API for people writing extensions, so they can rely on it whenever doing any blocking call.