Python’s built-in logging and other similar systems output a stream of factoids: they’re interesting, but you can’t really tell what’s going on.
Why is your application slow?
What caused this code path to be chosen?
Why did this error happen?
Standard logging can’t answer these questions.
But with a better model you could understand what and why things happened in your application. You could pinpoint performance bottlenecks, you could understand what happened when, who called what.
That is what Eliot does. eliot is a Python logging system that outputs causal chains of actions : actions can spawn other actions, and eventually they either succeed or fail . The resulting logs tell you the story of what your software did: what happened, and what caused it.
I think this bit at the top of that page might be referring just to asyncio, and didn’t get updated when trio was added?
On Python 3.7 or later, no particular care is needed. For Python 3.5 and 3.6 you will need to import either eliot (or the backport package aiocontextvars) before you create your first event loop.
It might be interesting in the example to also show what happens if you put the action inside the nursery, around the start_soon calls, and explain the difference. (Well, I’m curious what Eliot will do here anyway :-). And I’ve noticed people get confused about this in other cases, e.g. using a with semaphore to limit how many calls to start_soon can happen at a time, and not understanding why that doesn’t limit the number of tasks.)
The top-level action will end as soon as the start_soon calls are scheduled and the context manager exists, because it doesn’t know to wait for the nursery to end. And since the top-level action will end before the presumed child actions start, the child actions will be independent top-level actions.
In an ideal world the Eliot support would be built-in to Trio, and then the nursery context manager would ensure the continuity of actions…
I filed an issue to cover both points in the docs.
That makes sense. I wasn’t sure if eliot had a way to link together actions beyond nesting, like “this action was kicked off by that other action”. It’s a much more common pattern in Twisted/asyncio than it is in Trio, so it seemed like maybe it would…
In asyncio, if I write:
async def f():
async def g():
Then there’s a 50% change that the "f" action will be nested within the "g" action, and a 50% chance that they’ll be independent actions?
For the asyncio example, the random sleep is irrelevant to nesting. The nesting is determined based on the time start_action is called. So whether or not it’s independent actions depends on how asyncio contextvars integration works—i.e. whether the started task counts as an independent context or inherits the context.
As far as the two trees: if you look at locations in the tree (main/1, main/2, say/3/1, say/4/1), really those should be one tree.
They are displayed different trees in eliot-tree output as an artifact of how the parser works: it see the start message and finish message for the tree for a particular UUID, there are no missing messages in between, so it assumes it’s done and outputs the finished tree. (Basically you shouldn’t really be adding child actions to a finished action, though it is possible.)
Add Eliot support to Trio.
Eliot provides wrappers of Trio APIs.
Add buffering to the parser, where it only outputs actions after it’s parsed everything. This would have obvious memory usage problems.
Documentation saying “always put start_action outside of async context managers”.
API for wrapping async context managers?
nursery = trio.open_nursery()
with start_action_for_async_context_manger(nursery, action_type="myaction"):
# ... do stuff with nursery ...
I’m not sure what “Add Eliot support to Trio” would actually mean? It sounds like the data in the logs is already fully accurate; it’s just challenging to make sense of it efficiently.
If you’re using contextvars to track the eliot context, then you’re going to have situations where code is executing with its “current parent action id” pointing to an action that has already finished. Copying contexts around is what contextvars is designed to do, and it does it all the time (and this is even more true for asyncio than trio – e.g. every time you register a callback in asyncio, it copies the context). And part of the motivation for that was exactly logging use cases :-). Imagine a simple logging system that assigns each incoming RPC a unique id, and then prefixes all the subsequent log messages with that id… if the RPC handler spawns a background task, you probably want the log id to carry over into those tasks!
But it does make it tricky to figure out the last moment that someone could emit a log message that refers to a given action id… contextvars don’t have any kind of deterministic lifetime management.
I guess you could make the action id that you save in the ContextVar an full-fledged object with a custom __del__ method; when that gets invoked you know that all the contexts that were holding that id are gone. But this seems like a bad idea; doing things in __del__ is precarious under the best of circumstances, and __del__ calls can be arbitrarily delayed.
Maybe there’s some kind of clever multi-pass log parsing that could reconnect these events without needing to hold the whole thing in memory?
Maybe eliot-tree should just render the existing output better, like putting on note on the second round saying “(continued from previous event)”?
I guess we could consider extending contextvars to have some concept of lifecycle?
This is fairly intrusive and potentially adds substantial overhead, plus might generate a lot of log spam (maybe restrict it to only cases where the new task is spawned from inside an existing action?). Also there’s no asyncio equivalent. So my intuition is that eliot probably wants to do something sensible even if this isn’t set up :-). But, I can imagine that some people might be interested in it as an opt-in thing, and maybe it makes sense to provide some API to enable it, like calling eliot.log_all_tasks() or something?
I think you can accomplish the goal of the EliotNursery using the existing task_spawned and task_exited instrumentation hooks, but you might need better upstream support in order to handle nursery.start() correctly.
It’s true that the current implementation still adds unnecessary overhead for operations you haven’t hooked, but that’s just because we’re lazy and were waiting for someone to complain. It would be pretty straightforward to remove that overhead, by maintaining a seperate list of active instruments for each hook, and updating it when instruments are added and removed.