Structured concurrency in Rust

I’m not sure if you realized it yet, but a similar cancellation mechanism is implemented since 20 years in one of the most utilized programming languages: In Java!

As far as I understand, CancelRequestedError is basically equivalent to Java’s InterruptedException, which can be thrown by blocking APIs when external cancellation is requested. Of course what you describe is more powerful, since Java doesn’t have a CancelScope, and the scope of cancellation is a whole Thread. However Java has demonstrated that the cancellation errors are not properly used. I’ve seen enough try/catch blocks that don’t know what to do with that error and just ignore it.

I think the naming of CancelRequestedError might be a bit better than the Java version, and help people understand better what they are supposed to do. Maybe also the fact that cancellation in async code is far more typical than in synchronous code makes it easier to teach. But it will still be a challenge. Maybe on par with the CancellationToken/Context approach.

How would the return type for those cancellable operations then look like? Something like Result<Result<SuccessType, DomainSpecificError>, CancelRequestError>?

That seems a bit unergonomic for most operations. But we also can’t add the cancellation error to all other error code enumerations.

Looking forward to the next @njs blog post:
“Dear language designers: Don’t go back to the futures!”

3 Likes

Given how different Rust futures are to asyncio futures, it is unfortunate that they share the same name. The fact that Rust “async functions return futures” gives the reader used to Python a completely wrong idea of Rust async. To disentangle the mess, I’ve found a more useful mapping to be:

  • Rust future trait is equivalent to the Python awaitable protocol.
  • The concrete (opaque) type that implements future returned by a Rust async fn is roughly equivalent to Python’s concrete coroutine object type.

The fact that both Python and Rust provide a distinction between the interface (Rust future/Python awaitable) and the implementation (Rust impl Future created by async fn/Python coroutine object returned by invoking a coroutine function) is very useful because it allows innovations in how coroutines are actually implemented. For example, in Python it allows cython to implement efficient coroutines, or extensions to expose coroutines from languages like Rust at Python level. In Rust it allows you not to care whether a future is implemented by async fn or by manually implementing the trait, where the reason for the latter might also be the desire to expose a more basic system, e.g. through C FFI.

My personal feeling is that the this adds a ton of complexity without adding much value

I’m curious what complexity you are referring to here - the larger API surface and difficulty in teaching the language, or complexity in the implementation and usage of futures? Because at this level the only difference between Rust futures and Python awaitables is that the former, despite being low-level and light, offer the combinator API. But that doesn’t seem like a big problem in practice. As a future implementor, you get the combinators for free because they are automatically provided by the trait, and you get them regardless of whether your future comes from async fn or from a type that implements the trait manually.

I’m talking about the API surface complexity. Both in terms of teaching (cf the comment above that “Rust has already used most of its strangeness budget”), and in terms of how features like this interact in complicated ways with other parts of the system – complexity costs are super-linear. For example, there’s a whole discussion here about how dropping Futures has tricky semantics and maybe it would be better to handle cancellation another way – but if Futures are exposed, then you’re forced to also decide what happens when they’re dropped, and all Future code has to be prepared to be dropped, etc. In the approach I’m advocating, this tricky design problem simply disappears.

Yes, that’s right. Having seen how this turned out, I also think that Python would have been better off with just the coroutine object type – that experience is a major part of why I’m hoping Rust doesn’t make the same mistakes :-).

I don’t think this it’s actually useful to allow for innovations here. Think of Rust functions (ones with no explicit ABI tag). IIRC Rust doesn’t guarantee anything about their ABI; the compiler is free to change how Rust functions are implemented. And in practice, this is fine. If you have some other kind of functions (e.g., C functions), you can still call them via a translation layer; and the compiler even has some built-in features for generating these translation layers in specific cases. None of that requires that Rust publish and fix the Rust function ABI as part of the Rust API.

We could handle coroutines the same way – the main difference is that having alternative implementations is actually less useful for coroutines. For functions, there’s an existing de facto standard ABI you need to support for FFI (i.e., the C function ABI). For coroutines, there is no existing de facto standard like this. And since coroutines are necessarily tied to some kind of event loop / coroutine runner, I doubt there ever will be one. You can still interoperate with existing libraries – it just happens at a slightly higher level, by writing native async functions that wrap the other library and handle the translation between the semantics.

In Trio, we only use Python’s concrete coroutines, and systematically pretend that the whole “awaitable” protocol doesn’t exist. It works fine. We wrap C libraries all the time. Not a problem at all.

In practice, there are three kinds of independent implementations of awaitables in Python:

  • legacy libraries that use Future-like abstractions, like tornado/asyncio/twisted. This is the backwards-compatibility case that I’m arguing doesn’t provide enough value to be worth the costs.
  • library authors that exploit the awaitable protocol to do unnecessarily wacky things, that make their APIs weirder and more confusing. I suspect these will disappear as we collectively get more experience with what kinds of usage are considered tasteful, so it’s not a huge negative, but it’s a negative.
  • Cython. This is a very special case, tied up in very specific details of Python that don’t apply to Rust. Cython doesn’t just implement the awaitable protocol; it does the best it can to implement the concrete coroutine object type. (It even monkeypatches Python’s type introspection APIs to make Cython’s coroutine type look like Python’s coroutine type.) The reason they do this is because Cython’s goal is a second independent implementation of all of Python’s semantics, with seamless interoperability, and the only difference being that it’s statically compiled. The Rust equivalent would be to try to support seamless cross-linking between crates that were compiled with different independent implementations of the Rust compiler. This is not something Rust cares about, and if that ever changes then it will take a lot more than a Future API to make it possible.
1 Like

Java’s Thread.interrupt is an interesting case study, yeah.

At a high-level, there are only two states that a cancellation API ecosystem can be in: either almost everyone writes cancel-safe code, or almost no-one writes cancel-safe code. This is because cancel-safety is an AND property – a complex routine can only be cancel-safe if all the subroutines it uses are also cancel-safe. So if any measurable fraction of the libraries in your ecosystem aren’t cancel-safe, then in complex programs you have to assume that cancellation just isn’t safe to use at all. And that means that there’s no motivation to worry about making your code cancel-safe either. So none of your users can use the cancel system either… and the spiral continues.

To avoid this, it’s not enough for any specific person to try to write cancel-safe code; you need to design things so that for 99% of devs, the perceived benefits of writing cancel-safe code outweigh the perceived costs.

And to make things worse, writing cancel-safe code is just intrinsically, unavoidably harder than writing non-cancel-safe code. It’s something that you have to be thinking about in the back of your mind, all the time. You can’t avoid this with clever API design; It’s just inherent in the idea. For example, think of some code that takes a lock, manipulates the locked data, and then releases the lock. If it gets cancelled in the middle, while the data is in an inconsistent state, what happens? If the programmer doesn’t make sure to unwind the inconsistent state correctly, then everything will be screwed up.

(Side note: This is also why Rust mutexes are poisoned on panic. I don’t think Rust’s mutexes are currently poisoned if a Future is dropped while holding a lock. That seems like a dangerous choice to me, given that Future cancellation currently has the same control flow as panic.)

So if you’re trying to design a cancellation system, this is a pretty dire problem. Obviously part of the solution is to make it as painless as possible to write cancel-safe code – but we can’t expect to eliminate the pain entirely. There are always going to be perceived costs. And we can try to force people to support cancellation, basically increasing the pain of not handling it right, but this doesn’t go very far. Our only hope is to somehow provide one heck of a carrot, so that everyone can feel the value they’re getting from the cancellation system – not just us concurrency nerds.

This is where Thread.interrupt falls down. Cancelling a thread is a fairly rare requirement. It’s a real requirement; when you need it, you really need it! But there are also a lot of people who have never needed it, or who just aren’t experienced enough yet to realize why it’s important. And since they don’t feel the problem themselves, they aren’t motivated to understand the solution or spend energy handling it correctly. Yet, Java forces them to do something, because InterruptedError is a checked exception. So, you get the half-assed code you mentioned that just catches and discards it, and then this starts the spiral I talked about above, where if interruption doesn’t work reliably, they why should anyone else bother supporting it either…

I think there is one carrot that might work: timeouts. Everyone who does any kind of networking has to handle timeouts. They’re ubiquitous, and in most APIs trying to manage them correctly is a constant source of frustration. So if you can tell people hey, our cancellation system makes timeout handling easy and frictionless, then that’s really compelling. But, Java missed this opportunity: Thread.interrupt is totally useless for timeouts! Even APIs like HttpClient that do support Thread.interrupt have a totally different method for managing timeouts.

Cancel tokens have the opposite problem: they provide a clean solution to the timeout problem, so they have a lot of perceived value, but there’s too much perceived drudgery in passing them around everywhere and integrating them into different blocking constructs, so programmers rebel. And then you end up with situations like C# or Go, where they have this lovely cancel token system but even their own socket libraries don’t support it, and here comes our spiral. Or in table form:

Design approach Perceived drudgery Perceived value
Thread.interrupt Low Low
Cancel tokens High High
Cancel scopes Low High
2 Likes

Let’s nail down some terminology. In the coroutine approach, there are two separate types: there are async functions, what you get by writing async fn .... They’re a thing you can call. And there are coroutine objects, which represent a call-in-progress.

So at the implementation level, yeah, the two approaches are isomorphic: async functions are like Future-returning-functions, and coroutine objects are like Futures.

But at the user level, the coroutine approach is simpler. With Futures, there are three steps: call the Future-returning-function → get a Future → await the Future to get a regular value. With coroutines, the last two steps are collapsed together inside the compiler, so it’s just: call an async function → get back the value it returned.

It has to be an async function, not a coroutine object. I’m not sure if this is what you meant or not :-). But it’s very straightforward, just like any other higher-order function in Rust, just with some asyncs added:

// Return value simplified, see below for more discussion
async fn apply<T>(fun: AsyncFnOnce() -> T) -> T

OTOH the natural Future-based version would be:

fn apply<T>(fut: Future<T>) -> Future<T>

This is really different – remember that a Future is equivalent to a coroutine object, not an async function.

Here’s a straw man proposal for how this could all fit together:

We define async fn as a new concrete type that’s built into the compiler, just like fn. We also define a concrete type CoroutineCall to represent an in-progress coroutine.

There are two things you can do with an async fn:

  1. You can call it, just like a regular function, except there’s a restriction: this can only be done from inside another async function.

  2. There’s an operation – with special compiler support – that lets you extract a CoroutineCall object from an async fn. Maybe something like: unsafe fn coroutineify(fun: AsyncFnOnce() -> T) -> CoroutineCall<T>.

As you can see, there are trait versions of async fn, just like for fn. And, just like for fn, these are defined in terms of the concrete type async fn. E.g.:

#[lang = "async_fn_once"]
#[must_use]
pub trait AsyncFnOnce<Args> {
    type Output;
    extern "rust-call" async fn call_once(self, args: Args) -> Self::Output;
}

(Compare to the definition of FnOnce.)

I’ll be vague about the exact API for CoroutineCalls, but they would provide some sort of mechanism to step them, get back values when they suspend themselves, etc. I guess they could have a corresponding trait, beyond just the concrete object – it wouldn’t cause any particular problem – but I don’t see how it would be useful right now either, so I’ll leave it out.

Not so! This system is flexible enough to handle all that stuff. If you don’t believe me, check out Trio, which actually does it :-). The primitives you need are:

  • A way to suspend the current coroutine stack
    • while specifying how to wake it up if it’s cancelled
  • A way to request the coroutine stack be scheduled again

All I/O operations and “leaf futures” can be implemented in terms of these. Throw in nurseries and cancel scopes (which can be implemented inside the coroutine runner library, no need to be in std), and now you can do anything.

There are different ways to split up this functionality between the language runtime vs libraries. In Python, we just have “suspend with value” and “resume with value”, and then Trio implements these primitives on top of those. In Rust, the static typing, lifetimes, and desire for minimal overhead mean that you’d probably want to iterate a bit to find exactly the right way to encode these core primitives. But as a straw man, maybe a wait-for-control-C primitive might look roughly like this:

async fn wait_for_ctrlc() -> Result<(), CancelRequested> {
    suspend!(
        // first argument: setup function
        // arranges to call request_wakeup(value) at some future point
        |request_wakeup| {
            ctrlc::set_handler(move || request_wakeup());
        },
        // second argument: abort function
        // arranges that request_wakeup(value) *won't* be called
        || {
            ctrlc::unset_handler();
            AbortSucceeded
        },
    )
}

Notes:

  • All the language runtime has to support here is suspending the coroutine stack and passing these two callbacks out to the coroutine executor. Everything else can be done by the coroutine executor. (Subtlety: only the coroutine runner knows how to add a coroutine back to its scheduling queue, so we let it choose how request_wakeup is implemented.)

  • The protocol for handling aborts has to be somewhat complicated, to handle all the cases of (1) abort is easy and can be done immediately, like here, (2) abort is impossible, (3) abort is possible, but happens asynchronously, like with IOCP. This is the simple case, as indicated by the AbortSucceeded return value.

  • Thinking about lifetimes… if we want to avoid heap allocations, then I think we would need to say that if the abort function returns AbortSucceeded, then that invalidates the request_wakeup callback, which would mean we needed some unsafe in here. Maybe this can be avoided by being clever somehow. If not, then I don’t think it’s a big deal? Very few people need to implement their own I/O primitives from scratch using the lowest-level API. And the common cases for this are like, implementing your own async-friendly Mutex, which is an extremely tricky thing to do no matter how you slice it.

This part I do agree with :-). If you want to pass between the old Future style and the new async fn style, you could do that by writing async fn await<T>(fut: Future<T>) -> Result<T, CancelRequested>. It would look something like our wait_for_ctrlc function: use combinators to create a new Future that selects between the user’s future and a second future that gets resolved by the abort function, and then calls request_wakeup when that resolves, and then tokio::executor::spawn this new Future and go to sleep.

But, Futures do become a second-class citizen: they need to be explicitly awaited, they use a different cancellation system, etc. I think people would migrate away from them over time.

What a great question! I don’t know :-).

I guess for apply, being the only thing that can return CancelSucceeded, the double-nested-Result might indeed make sense. But I think you’re right, for the majority of operations, all of which can return CancelRequested, using double-nested-Result would probably suck.

In general, the functions that can return CancelRequested are functions that do I/O. And in general, functions that do I/O already have a ton of different error types. So I was imagining that we’d embed CancelRequested into the various I/O error enums. Maybe impl TryFrom<YourFavoriteErrorType> for CancelRequested? This is a place where my lack of Rust expertise means I really can’t tell for sure what would make sense, so maybe this is where the whole idea founders! I would be interested to hear folks thoughts.

2 Likes

Ok, I didn’t know that - I automatically assumed that while Trio avoided “fat” asyncio-style awaitables (awaitable objects providing additional functionality and API), but was fine with however the awaitable itself is implemented.

The awaitable protocol seems like a reasonable answer to the question “how do I implement an async def in C” or “how do I await in Python/C”? That kind of thing sometimes pops up on StackOverflow, and while the solution is far from pretty, it’s at least possible. I understand why you’d want to avoid asyncio futures with their unstructured and callback-centered API, but awaitables seem little more than a thin duck-typing generalization of Python’s existing implementation of coroutines. What is the benefit of ignoring that?

Purely implementational innovations, like Cython, seem like a plus. A more far-fetched example would be a native wrapper over a Rust or C++20 coroutine. Sure, you could still achieve a similar effect by creating an async def and using it to drive the foreign coroutines over a wrapper, but this has two drawbacks:

  • It is slower. As anyone who does Python for a living knows all too well, Python abstractions are high-cost, and every additional one (on a hot path) affects performance of the system.

  • It breaks with the broader Python conception of decoupling the virtual machine and object system from the interpreter. Most of Python doesn’t care if a callable is created with def, as long as it implements __call__, or whether an iterable is created using a generator or an itertool, as long as it implements __iter__, and so on. There are exceptions to this, and coroutines could be another, but the benefits need to clearly outweigh the downsides.

(I think this is an instance of some of the general difficulties involved with Rust’s error handling model – using nominal sum types to represent possible errors works (and to be clear, the benefit of compiler-checked exhaustiveness means I prefer it to most of the mainstream alternatives!), but achieving compositionality can be laborious. Having either row polymorphic variants or effect types, so that the set of possible errors can be expanded or contracted locally, as needed, rather than having to define new nominal enums and embeddings between them by hand (or working with nested Results, or whatever), would probably be helpful here.)

Of course, we have to think in terms of actually-existing Rust. Given that ErrorKind already contains things like WouldBlock, TimedOut, and Interrupted, your assumption that we could just add Cancelled to the list (or maybe repurpose one of the others?) doesn’t seem unreasonable to me on its face. But I’m not really familiar with the details w.r.t. what consequences that would have.

How impractical or inadvisable would it be to fold the cancellation effect into async itself? That is, any async fn has the potential to be cancelled, and when calling one, it automatically checks whether it has been (as if the ?-for-cancellation were implicit), so cancellation automatically propagates (if I understand things correctly) back up to the point where it was requested.

Is the issue that for the less-simple “(2) abort is impossible, (3) abort is possible, but happens asynchronously, like with IOCP” cases, automatic propagation is not what you want, and you need to do something else when you get a CancelRequested error? (And is that also the problem with the Drop-based approach, which the paragraph above otherwise resembles?) If that’s the case, could it be solved by making intercept-cancellation (rather than propagate) the special case, maybe with some sort of async fn catch_cancel<T>(impl AsyncFnOnce() -> T) -> Result<T, CancelRequested> primitive?

So, CoroutineCall will have the same core API as the Future trait? The difference would be that you don’t use it directly, so you don’t have a slew of monadic combinators.

pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, lw: &LocalWaker) -> Poll<Self::Output>;
}