I’ve talked a lot about fibers before. But what are they and how do they work?
From the programmer’s perspective they can feel a lot like a thread. They have a stack just like a real thread. They maintain a program counter just like a real thread. They can spill registers to the stack like a real thread.
Most importantly, you can call into GTK from your fiber if it is running on the main thread. Your fiber will have been dispatched from the main loop on the same OS thread, so this is a great use of libdex.
There are downsides too, though. You have to allocate a stack and guard pages for them, like a real thread. They have some cost in transitioning between stacks, even if it is fairly low these days. You also need to be mindful to own the lifecycle of pointers on your stack if you intend to await.
Many fibers may work together on a single thread where each runs a little bit until yielding back to the scheduler. This is called “cooperative multi-tasking” because it is up to the fibers to be cooperative and yield when appropriate.
That means that you generally should not “block” when writing code for fibers. It not only blocks your own fiber from making progress but all other fibers sharing your thread.
The way around this is to use non-blocking APIs instead of blocking calls like open() or read(). This is where combining a library for Futures and a library for Fibers makes a lot of sense. If you provide asynchronous APIs based on futures you immediately gain a natural point to yield from a fiber back to the scheduler.
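To make “non-blocking” concrete, here is a minimal sketch at the system call level. It is not libdex code: try_read_byte() and nonblocking_demo() are hypothetical names I made up for illustration, and it only assumes a POSIX system. With O_NONBLOCK set, read() on an empty pipe returns immediately with EAGAIN instead of stalling the thread, and that is the sort of spot where a fiber would yield rather than wait.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: 1 = got a byte, 0 = would block (a fiber would
 * yield here), -1 = end-of-file or some other error. */
static int
try_read_byte (int fd, char *out)
{
  ssize_t n = read (fd, out, 1);

  if (n == 1)
    return 1;
  if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
    return 0;
  return -1;
}

/* Returns 0 if the pipe behaved as described above, -1 otherwise. */
int
nonblocking_demo (void)
{
  int fds[2];
  int flags, first, second;
  char c = 0;

  if (pipe (fds) != 0)
    return -1;

  /* Mark the read end as non-blocking. */
  flags = fcntl (fds[0], F_GETFL);
  if (flags < 0 || fcntl (fds[0], F_SETFL, flags | O_NONBLOCK) != 0)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }

  /* Nothing written yet: a blocking read() would hang the thread here. */
  first = try_read_byte (fds[0], &c);

  /* Once data is available the same call succeeds. */
  if (write (fds[1], "x", 1) != 1)
    first = -1;
  second = try_read_byte (fds[0], &c);

  close (fds[0]);
  close (fds[1]);

  return (first == 0 && second == 1 && c == 'x') ? 0 : -1;
}
```

A future-based API hides this retry dance: the “would block” case becomes a pending future, and awaiting it is the yield.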
The scheduler maintains two queues for fibers on a thread. That is because fibers exist in one of two (well, three, sort of) states.
- The first state is “runnable” meaning the fiber can make progress immediately.
- The second state is “blocked” meaning the fiber is waiting on a future to complete.
- The (sort of) third state is “finished”, but in libdex a finished fiber is removed from all queues.
When a fiber transitions from runnable to blocked, its linked-list node migrates from one queue to the other. Naturally, that means we must be waiting for a future to complete. The scheduler will register itself with the dependent future so it may be notified of completion.
Upon completion of the dependent future our fiber will move from the blocked queue to the runnable queue. The next GMainContext iteration will transition into the fiber’s stack, restore register values, set the instruction pointer and continue running.
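The queue bookkeeping can be modeled in a few lines of C. This is a toy sketch, not libdex’s actual data structures: the Fiber, Queue and Scheduler types and every function name here are assumptions made up for illustration. It only shows a node migrating between two intrusive linked lists as the fiber blocks, unblocks and finishes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types: each fiber embeds its own list links. */
typedef struct Fiber
{
  struct Fiber *prev;
  struct Fiber *next;
  int           id;
} Fiber;

typedef struct
{
  Fiber  *head;
  Fiber  *tail;
  size_t  length;
} Queue;

typedef struct
{
  Queue runnable;  /* fibers that can make progress now */
  Queue blocked;   /* fibers waiting on a future */
} Scheduler;

static void
queue_push_tail (Queue *q, Fiber *f)
{
  f->prev = q->tail;
  f->next = NULL;
  if (q->tail != NULL)
    q->tail->next = f;
  else
    q->head = f;
  q->tail = f;
  q->length++;
}

static void
queue_unlink (Queue *q, Fiber *f)
{
  if (f->prev != NULL)
    f->prev->next = f->next;
  else
    q->head = f->next;
  if (f->next != NULL)
    f->next->prev = f->prev;
  else
    q->tail = f->prev;
  f->prev = f->next = NULL;
  q->length--;
}

/* The fiber awaited a pending future: runnable -> blocked. */
static void
scheduler_block (Scheduler *s, Fiber *f)
{
  queue_unlink (&s->runnable, f);
  queue_push_tail (&s->blocked, f);
}

/* The awaited future completed: blocked -> runnable. */
static void
scheduler_unblock (Scheduler *s, Fiber *f)
{
  queue_unlink (&s->blocked, f);
  queue_push_tail (&s->runnable, f);
}

/* A finished fiber sits in neither queue. */
static void
scheduler_finish (Scheduler *s, Fiber *f)
{
  queue_unlink (&s->runnable, f);
}

/* Walks two fibers through the states; returns the number of fibers
 * left runnable at the end, or -1 if an intermediate check fails. */
int
queues_demo (void)
{
  Scheduler s = { { NULL, NULL, 0 }, { NULL, NULL, 0 } };
  Fiber a = { NULL, NULL, 1 };
  Fiber b = { NULL, NULL, 2 };

  queue_push_tail (&s.runnable, &a);
  queue_push_tail (&s.runnable, &b);

  scheduler_block (&s, &a);
  if (s.runnable.length != 1 || s.blocked.length != 1 || s.runnable.head != &b)
    return -1;

  scheduler_unblock (&s, &a);
  if (s.runnable.length != 2 || s.blocked.length != 0 || s.runnable.tail != &a)
    return -1;

  scheduler_finish (&s, &b);
  return (int) s.runnable.length;
}
```

Because the links live inside the fiber itself, moving between queues is a couple of pointer updates with no allocation.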
Fibers get their stack from a pool of pre-allocated stacks. When they are discarded they return to the pool for quick re-use. If we have too many saved then we release the stacks’ memory back to the system. It’s all just mmap()-based stacks currently.
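For a rough idea of what an mmap()-based stack with a guard page looks like, here is a sketch. It is not libdex’s allocator: FiberStack and the fiber_stack_* functions are hypothetical names, the sizes are arbitrary, and I’m assuming a platform where stacks grow downward so the guard page goes at the lowest address. The pool itself is left out.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict -std= on glibc */
#endif

#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical descriptor for one fiber stack. */
typedef struct
{
  void   *base;         /* start of the whole mapping (the guard page) */
  size_t  total_size;   /* guard page + usable area */
  char   *usable;       /* first usable byte, just above the guard */
  size_t  usable_size;
} FiberStack;

static int
fiber_stack_init (FiberStack *stack, size_t requested)
{
  long page_size = sysconf (_SC_PAGESIZE);
  size_t page, usable_size, total;
  void *base;

  if (page_size <= 0)
    return -1;
  page = (size_t) page_size;

  /* Round the usable area up to a whole number of pages. */
  usable_size = (requested + page - 1) / page * page;
  total = usable_size + page;

  base = mmap (NULL, total, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED)
    return -1;

  /* Make the lowest page inaccessible so an overflow faults instead of
   * silently scribbling over neighbouring memory. */
  if (mprotect (base, page, PROT_NONE) != 0)
    {
      munmap (base, total);
      return -1;
    }

  stack->base = base;
  stack->total_size = total;
  stack->usable = (char *) base + page;
  stack->usable_size = usable_size;
  return 0;
}

static void
fiber_stack_clear (FiberStack *stack)
{
  munmap (stack->base, stack->total_size);
  stack->base = NULL;
  stack->usable = NULL;
}

/* Returns 0 if a stack could be mapped and its usable area is writable. */
int
fiber_stack_demo (void)
{
  FiberStack stack;
  int ok;

  if (fiber_stack_init (&stack, 16000) != 0)
    return -1;

  stack.usable[0] = 1;
  stack.usable[stack.usable_size - 1] = 2;
  ok = stack.usable_size >= 16000 &&
       stack.usable[0] == 1 &&
       stack.usable[stack.usable_size - 1] == 2;

  fiber_stack_clear (&stack);
  return ok ? 0 : -1;
}
```

A pool would keep a list of these around and hand them back out instead of calling mmap() for every new fiber.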
You might be wondering how we transition into the fiber’s stack from the thread. Libdex has a few different strategies for that based on the platforms it supports.
Windows, for example, has native support for fibers in their C runtime. So it uses ConvertThreadToFiber() and ConvertFiberToThread() for transitions.
On Linux and many other Unix-like systems we can use makecontext() and swapcontext() to transition. There was a time when swapcontext() was quite slow and so people used specialized assembly to do the same thing. These days I have found that to be unnecessary (at least on Linux).
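If you have never used these calls, here is a small standalone sketch of the idea, assuming Linux with glibc. It is not how libdex is structured: the global contexts, the trace array and the function names are all made up for illustration, and the stack comes from malloc() to keep it short. The “scheduler” switches into the fiber, the fiber yields back once, then gets resumed and runs to completion.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t scheduler_ctx; /* saved state of the calling thread */
static ucontext_t fiber_ctx;     /* saved state of the fiber */
static int trace[4];             /* order in which the steps ran */
static int n_trace;

static void
record (int step)
{
  trace[n_trace++] = step;
}

static void
fiber_func (void)
{
  record (1);
  /* Yield: save our registers in fiber_ctx, resume the scheduler. */
  swapcontext (&fiber_ctx, &scheduler_ctx);
  record (3);
  /* Returning from here resumes uc_link, which is the scheduler. */
}

/* Returns 0 if the steps ran in the order 1, 2, 3, 4. */
int
swapcontext_demo (void)
{
  size_t stack_size = 64 * 1024;
  void *stack = malloc (stack_size);

  if (stack == NULL)
    return -1;

  n_trace = 0;

  if (getcontext (&fiber_ctx) != 0)
    {
      free (stack);
      return -1;
    }
  fiber_ctx.uc_stack.ss_sp = stack;
  fiber_ctx.uc_stack.ss_size = stack_size;
  fiber_ctx.uc_link = &scheduler_ctx;
  makecontext (&fiber_ctx, fiber_func, 0);

  /* Run the fiber until it yields. */
  swapcontext (&scheduler_ctx, &fiber_ctx);
  record (2);

  /* Resume the fiber right where it left off; it then finishes. */
  swapcontext (&scheduler_ctx, &fiber_ctx);
  record (4);

  free (stack);

  return (n_trace == 4 &&
          trace[0] == 1 && trace[1] == 2 &&
          trace[2] == 3 && trace[3] == 4) ? 0 : -1;
}
```

Each swapcontext() call saves the current registers and stack pointer into its first argument and loads the ones from its second, which is all a “transition” really is.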
Another way to transition stacks is by using sigaltstack(), but libdex does not use that method.
Libdex fibers work on Linux, FreeBSD, macOS (both x86_64 and aarch64), Windows, and Illumos/Solaris. I believe Hurd also works but I’ve not verified that.
When your fiber first runs you will be placed at the base of your new stack. So if you found yourself in a debugger at your fiber’s entry point, it might look like you are one function deep. The first function from libdex would be equivalent to your _start on a regular thread.
If you await while 5 functions deep, then your fiber’s register state will be saved and its stack is set aside. The fiber will transition back to the scheduler where it left off. Then the original thread state is restored and the fiber scheduler can continue on to the next work item.
At the core, the fiber scheduler is really just a GSource within the GMainContext. It knows when it can flag itself as runnable. When dispatched it will wake up any number of runnable fibers.
To make sure that we don’t have to deal with extremely tricky situations, fibers may not be migrated across threads. They are always pinned to the thread they were created on. If that becomes a problem, it is usually better to break up your work into smaller tasks.
Another feature that has become handy is implicit fiber cancellation. A fiber is itself a future. If all code awaiting completion of your fiber has discarded interest, then your fiber will be implicitly cancelled.
Where this works out much better than real thread cancellation is that we already have natural exit points where we yield. So when your fiber goes to dex_await() it will get a DEX_ERROR_FIBER_CANCELLED in a GError. Usually when you get errors you propagate that rejection by returning from your fiber, easy.
If you do not want implicit fiber cancellation, you can “disown” your fiber using dex_future_disown().