A Mental Model for C++ Coroutine

C++ coroutine is not a ready-to-use library facility (like std::vector). It is not even a trait (think of Rust’s Future trait) that library writers or users implement (or, in Rust’s case, that the compiler generates for you). C++ coroutine is a specification that defines a set of customization points, which library writers must implement in order to get a functional coroutine.

A function supports two operations: call and return. A coroutine (in any language) is a generalization of a function. It supports suspend, resume, and destroy in addition to call and return ¹. By this definition, a C++ coroutine and a Rust Future are both coroutines. C++ coroutine provides customization points around call, return, suspend, and resume; it also exposes a handle that allows users to explicitly destroy a coroutine.

Example

Here is what a potential coroutine add could look like. All it does is sleep one second, add two numbers and return the result. add should suspend at co_await and resume execution once co_sleep(1) completes.

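Task<uint32_t> co_sleep(uint32_t sec);  // defined below

Task<int32_t> add(int32_t a, int32_t b) {
    // Suspends here and resumes once co_sleep(1) completes.
    uint32_t actualSleepSec = co_await co_sleep(1);
    co_return a + b;
}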

Task<uint32_t> co_sleep(uint32_t sec) {
    // ...
}

Similar to many other languages, the presence of the keyword co_await (or co_yield or co_return) signals that add is a coroutine. Let us pretend to be the compiler for a second: the only thing that can be used to customize the behavior of the coroutine is the return type Task<T>. And indeed, that is where C++ extracts the promise_type to start the customization process. We will cover the promise_type in more detail later in the post.

The Obvious/Rust Approach

An obvious option here is to make Task<T> a subclass of some language-defined interface (which owns the coroutine frame² for storing state across co_await points), where a set of customization points can be implemented as member functions to specify the behavior of this coroutine. If we lump everything into a single class (Task<T> in this case), resume will be part of the public interface as well, since one needs to be able to resume the coroutine somehow, just like Rust’s Future::poll (suspend usually only takes place at well-defined await points).

How does a coroutine know it is ready to be resumed? In Rust’s case, a callback (the Waker) is passed to the Future. The callback reschedules the Future to be polled again when its dependency is satisfied. In our example, the callback will resume the add coroutine after co_sleep is complete. But the return value from co_sleep still needs to reach the caller somehow. Rust solves this problem by passing the callback all the way from the top-most Future to the bottom Future, and whenever the state changes (e.g. because a network reply is received), the callback is invoked to reschedule the entire Future chain. In our example, the return value from co_await co_sleep(1); will be acquired on the next poll of add.

This is a good way of modeling async computation. C++’s coroutine spec, as we will see in a minute, is far more flexible. It is in fact flexible enough that we can implement Rust Future’s poll-based semantics (barring the compiler transformation of each Future’s state machine) with C++ coroutine.
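To make the poll-based model concrete, here is a hand-written sketch in plain C++ (no coroutine keywords involved). It is only loosely modelled on Rust: the names Future, Waker, Timer, Add and run_poll_demo are invented for this illustration, the waker is a plain std::function, and the "executor" is a tiny loop inside the demo function.

```cpp
#include <cassert>
#include <functional>
#include <optional>

// Hypothetical poll-style interface, loosely modelled on Rust's Future::poll.
using Waker = std::function<void()>;

template <typename T>
struct Future {
    virtual ~Future() = default;
    // Returns a value when ready; otherwise keeps `wake` to signal progress later.
    virtual std::optional<T> poll(const Waker& wake) = 0;
};

// Leaf future: becomes ready once an external event calls fire().
struct Timer : Future<unsigned> {
    bool fired = false;
    Waker waker;
    std::optional<unsigned> poll(const Waker& wake) override {
        if (fired) return 1u;
        waker = wake;  // remember whom to notify
        return std::nullopt;
    }
    void fire() {
        fired = true;
        if (waker) waker();
    }
};

// Hand-written state machine for `add`: wait on the timer, then add.
struct Add : Future<int> {
    Timer& timer;
    int a, b;
    Add(Timer& t, int a, int b) : timer(t), a(a), b(b) {}
    std::optional<int> poll(const Waker& wake) override {
        if (!timer.poll(wake)) return std::nullopt;  // pass the waker down the chain
        return a + b;
    }
};

int run_poll_demo() {
    Timer timer;
    Add add(timer, 1, 2);
    bool woken = true;  // poll once to get started
    Waker wake = [&] { woken = true; };
    std::optional<int> result;
    auto drive = [&] {
        while (woken && !result) {
            woken = false;
            result = add.poll(wake);  // always poll from the top-most future
        }
    };
    drive();       // first poll: the timer is not ready and stores the waker
    assert(!result);
    timer.fire();  // the event arrives; the waker marks the chain as runnable
    drive();       // second poll from the top produces the value
    return *result;
}
```

Note how the result only surfaces on the second poll of the top-most future, which mirrors the description above.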

C++ Coroutine philosophy

Flexibility - it allows one to customize every coroutine operation in meaningful ways. Most importantly,

what happens when a coroutine is first called
what happens before a coroutine returns a value
what happens right before a coroutine reaches the end of its lifetime
what happens when one wants to destroy a coroutine
what happens after a coroutine is suspended
what happens before a coroutine is resumed

So each coroutine can have its own unique behaviors.

The `promise_type`

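As we observed earlier, the return type Task<T> is the best entry point for users to customize coroutine behaviors. C++ coroutine requires that a Task<T>::promise_type type be defined (more precisely, the compiler looks it up through std::coroutine_traits, which defaults to the nested promise_type of the return type).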

the promise type can specify the coroutine behavior - most importantly when add is called, before add returns, before the end of the coroutine’s lifetime (destroy)
the promise object (of the promise type) can be used to store additional state in the coroutine frame

When a coroutine is first called, C++ will automatically create a promise object for you, which lives inside the coroutine frame (that is allocated at the same time).

The Awaiter and the Awaitable

The promise_type covers the customization of the call and return operations of a coroutine. We have suspend and resume left. Suspension takes place at well-defined suspension points: initial_suspend (right after call), final_suspend (right before the end), co_await, and co_yield. If one simply calls add(1, 2) with initial_suspend returning suspend_always, the coroutine is suspended before its body runs. In most cases, one wants to customize the behavior of suspend so that the suspended coroutine can be resumed later. A coroutine can be suspended for two main reasons:

to provide a customization opportunity - e.g. initial_suspend and final_suspend
to wait for a dependency - a.k.a awaitable

In the first case, there is no dependency/awaitable. In the second case, it would be very helpful to have a reference to the awaitable when customizing the suspend operation, so that we can resume the current coroutine when the awaitable is resolved. It would be awkward to provide a suspend customization point that only sometimes receives an awaitable. C++ coroutine solves this problem by introducing the awaiter concept. At co_await awaitable;, the compiler creates an awaiter object by calling the operator co_await overload on the awaitable if there is one; otherwise the awaitable itself serves as the awaiter (let us ignore await_transform for now). In most cases, the awaiter object holds on to the awaitable as a data member.

The magic methods await_suspend and await_resume on the awaiter object customize the suspend and resume operations of the coroutine; a third method, await_ready, lets the awaiter skip suspension altogether when the result is already available. Most importantly, await_suspend is called with an argument that represents the current coroutine, which has just been suspended. So at this point, the awaiter has a reference to the awaitable as well as to the current coroutine. This allows us to arrange when the resume operation is called. A common approach is to store the current coroutine handle in the awaitable’s promise object; when the awaitable reaches the end of its lifetime (final_suspend), it can retrieve the awaiting coroutine and resume it. The return value of await_resume becomes the value of the co_await expression. Usually the result can be passed via the awaitable’s promise object as well.

final_suspend returns an awaiter object. We have seen awaiters already: suspend_always is an empty awaiter type whose await_ready always returns false. Conceptually, this is equivalent to putting co_await promise.final_suspend(); right before the end of the coroutine. await_suspend can be customized for the final awaiter object, which again receives a handle to the current coroutine, from which we can extract the promise object with whatever state we put in.

We are making a big assumption here that every awaitable is itself a coroutine. It is likely true most of the time. To cover the remaining cases, await_transform comes in. It provides yet another customization point: it allows the coroutine to transform (usually wrap) the expression in co_await expr; into an awaitable.

¹ https://lewissbaker.github.io/2017/09/25/coroutine-theory
² A coroutine frame is where the async stack frame lives (conceptually on the heap), and where additional state is kept for resuming the coroutine later.
