Discussion: "Notes on structured concurrency, or: Go statement considered harmful"

You can use this thread to discuss the blog post Notes on structured concurrency, or: Go statement considered harmful.

Out of curiosity, what is your opinion about this post?
https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/

Great article, I’m sold on Nurseries. You mentioned “The only other software I’ve used that thinks “print something and keep going” is a good error handling strategy is grotty old Fortran libraries, but here we are.”

It also reminds me of VBScript and VBA, where people commonly wrote On Error Resume Next at the top of the script without thinking about what that meant, and then wondered why their code silently failed.

That’s probably my lack of Python knowledge, but in the section “Automatic resource cleanup works” you write that you can’t access already closed file handles. However, doesn’t this depend on the ordering of the with blocks?

That is, this will work as described:

with open("my-file") as file_handle:
    async with trio.open_nursery() as nursery:
        # start_soon takes the async function and its arguments separately
        nursery.start_soon(read_file, file_handle)

However, this will end in an error:

async with trio.open_nursery() as nursery:
    with open("my-file") as file_handle:
        # start_soon takes the async function and its arguments separately
        nursery.start_soon(read_file, file_handle)

The problem here is that start_soon returns immediately, which exhibits goto-like control flow problems (as described). However, they are localized, which is the important part :slight_smile:

I forgot to reply to this at the time, but it just came up again in chat, and I ended up writing a little reply that I think might be a good complement to Cory’s post that @belm0 linked to. So I’ll paste it here too:

yeah, python async/await is definitely a two-color system
I think my main frustration with that post is that it takes for granted that having two function colors is obviously a bad thing
…though to be fair, I can see how if you’re starting with js callbacks as your main experience with them, then it does feel pretty obviously problematic
but fundamentally, function color is a way to encode certain behavioral properties into your type system, like “this function does IO”. The thing about that property is that it’s transitive over call stacks: if A calls B and B does IO, then A does IO too. So if you think “this function does IO” is a useful thing to encode in your type system, then function color is how you do it. And that plausibly is a useful thing to include in a type system, e.g. haskell does it.
so IMO the question is really whether any given function color system has a good enough ergonomics/functionality tradeoff to be worthwhile.
in trio, the property it encodes is “could release the GIL and could be interrupted by cancellation”, which are both pretty important behavioral properties! putting them in the type system has a lot of value. and migrating to async/await is painful, but it’s way less painful than callbacks.
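To make the "transitive over call stacks" part concrete, here is a small sketch using stdlib asyncio rather than Trio (an assumption made so it runs without extra installs; the coloring rule is the same async/await one). The function names fetch, load and sync_caller are hypothetical.

```python
import asyncio
import inspect


async def fetch():
    # "Colored" function: it contains an await, so it may yield to the scheduler.
    await asyncio.sleep(0)
    return "data"


async def load():
    # load() calls fetch(), so it has to be async too: the color propagates
    # up the call stack and shows up in load's own signature.
    return (await fetch()).upper()


def sync_caller():
    # A plain function can't run fetch(); calling it only builds a coroutine
    # object that never executes unless something awaits it.
    coro = fetch()
    is_coroutine = inspect.iscoroutine(coro)
    coro.close()  # tidy up so Python doesn't warn about a never-awaited coroutine
    return is_coroutine


if __name__ == "__main__":
    print(asyncio.run(load()))
    print(sync_caller())
```

The useful bit is that you can tell from the signature alone (async def or not) whether a call might suspend, without reading the function body.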

Yeah, I was being a bit hand-wavy there. The point is that you can now see which code is running inside the with block and which code isn’t: the nursery.start_soon call happens inside the with block, and indeed, the file will remain open for the start_soon call. But if you know what start_soon does (it schedules a task to start running in the near future, but returns before it’s actually started), then hopefully it should be pretty obvious that closing the file after calling start_soon is not what you want. And – crucially! – you can tell that just from looking at the source code inside the with block; you don’t have to go look inside read_file.

Here’s another example that might make this clearer, that doesn’t involve concurrency at all. It’s totally possible to use regular with blocks in a broken way too, for example:

with open("my-file") as file_handle:
    another_var = file_handle
read_file(another_var)

Obviously this code won’t work correctly, but you can write it if you want. So in general, with blocks don’t, like, strictly guarantee that you can never access a closed file handle. (For that you need something like Rust’s lifetime tracking.) But in practice, it’s OK; people don’t write code like this by accident, because it’s obviously wrong. And once you’re used to nurseries, then your “bad” example is obviously wrong too, in basically the same way.
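If you want to see the failure for yourself, here is a minimal runnable version of that broken pattern. It is only a sketch: the temp file and the demo function are stand-ins I made up, and read_file here is a plain synchronous function, not the one from the blog post.

```python
import os
import tempfile


def read_file(handle):
    return handle.read()


def demo():
    # Create a throwaway file to stand in for "my-file".
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w") as f:
        f.write("hello")
    try:
        with open(path) as file_handle:
            another_var = file_handle          # the handle "leaks" out through this name
            inside = read_file(another_var)    # fine: still inside the with block
        try:
            read_file(another_var)             # the with block already closed the file
            outside = "no error"
        except ValueError as exc:
            outside = f"ValueError: {exc}"
    finally:
        os.remove(path)
    return inside, outside


if __name__ == "__main__":
    print(demo())
```

Nothing stops you from writing it, but the read after the block fails with a ValueError because the handle is closed.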

Hi,
I have just read your thoughts on concurrency APIs and this post about the “go statement”. First, thanks for these insights.
Something troubles me, though. It is about error handling. You said that:

As noted above, in most concurrency systems, unhandled errors in background tasks are simply discarded. There’s literally nothing else to do with them.

To be honest I am still a “newbie” in concurrent programming, and I have only just started using gevent and learning asyncio, but I think the statement quoted above is not fair. Both of those frameworks have a task/future mechanism for retrieving an exception that occurred in a task.

Also, I’m not sure that propagating errors to the parent task is the best strategy, since it cancels all the child tasks that are still running. In the context of an application server, I think it’s problematic that all user queries can be canceled because of an error that occurred in a single query.

Thanks in advance for the answers :slight_smile:

Right… every system has a way to handle errors explicitly – in regular Python or Trio you use try/except/finally; in asyncio or gevent you use try/except/finally+some custom method to explicitly check exceptions in background tasks. But the key word in the text you quoted is unhandled – sometimes people don’t write explicit error handling code like this, so as framework authors we have to decide what should happen. We have to pick some default behavior.

The Zen of Python suggests: Errors should never pass silently, unless explicitly silenced. That’s what Python and Trio do: exceptions keep going until you catch them or until the program exits. Of course this may not always be what you want (like in your application server), but the solution is easy enough: just catch the exception and do something else with it :slight_smile: It’s the safest default, but you can always override it.

In other frameworks like asyncio or gevent, if an error occurs in a background task, and you don’t remember to explicitly check for it, then they just print a message on the console and then discard the error. Sometimes, if you’re lucky, this might be what you want. But the framework really has no way to know which unhandled errors you intended to discard, and which ones are actually some serious bug that needs to be handled. So it’s a pretty dangerous thing to do by default.
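Here is a small asyncio sketch of that behavior, with made-up names (failing_job, main). The background task fails, the parent carries on regardless, and the error is only visible because the code remembers to ask the task for it. If you never ask, asyncio just logs "Task exception was never retrieved" when the task object is garbage-collected.

```python
import asyncio


async def failing_job():
    raise RuntimeError("boom")


async def main():
    task = asyncio.create_task(failing_job())
    await asyncio.sleep(0.01)      # give the background task time to run and fail
    parent_kept_running = True     # nothing was raised here in the parent
    # The exception is parked on the task object until someone retrieves it.
    error = task.exception()
    return parent_kept_running, error


if __name__ == "__main__":
    print(asyncio.run(main()))
```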

Thank you for the reply. I’m currently reading the Trio tutorial, and I’m going to try it out.

Thanks, yes you are right, you can always leak the file handle. Just wanted to make sure I understand how this works in Python correctly :slight_smile: