Reducing nested callbacks #60
In a real example, the callbacks would typically be input to an API responsible for the invocation of those callbacks in some way (event handlers, timer scheduling, etc), hence the need for managing their “async contexts” at all, right? For inline code, let asyncVar;
// Sets the current value [of the lexical binding] to 'top', and executes the `main` function.
asyncVar = 'top';
main();
function main() {
// [lexical scope] is maintained through other platform queueing.
setTimeout(() => {
console.log(asyncVar); // => 'top'
{
let asyncVar = 'A';
console.log(asyncVar); // => 'A'
setTimeout(() => {
console.log(asyncVar); // => 'A'
}, randomTimeout());
}
}, randomTimeout());
// [lexical scopes] can be nested.
{
let asyncVar = 'B';
console.log(asyncVar); // => 'B'
setTimeout(() => {
console.log(asyncVar); // => 'B'
}, randomTimeout());
}
} At least that’s my understanding — I could be missing something, but my impression was that “it’s about callbacks” was pretty intrinsic to the motivations for the API. |
Yes, the fully inline case is trivial and not something that we can't already do with lexically-scoped variables. However, in other cases, AsyncContexts aren't equivalent to lexically-scoped variables, in that they still behave as contexts which can bypass argument drilling for deeply nested (async and non-async) functions. And ultimately my suggestion is about preserving that ability while reducing the callbacks necessary to update the context values. Even the use cases doc provides a sync implementation for
|
We had actually discussed using |
Thank you for the link to that discussion, it was a very interesting read.
That's a valid point. It's certainly a worse kind of error to leak data between execution contexts than it is to forget to close/release some resource. However, another issue of the callback-style API lies in the use and setting of multiple async contexts at a time. My understanding of the current API is that multiple contexts can only be set at once by daisy chaining {
const ctxA = new AsyncContext.Variable();
const ctxB = new AsyncContext.Variable();
const ctxC = new AsyncContext.Variable();
ctxA.run(1, () => {
return ctxB.run(2, () => {
return ctxC.run(3, sum);
});
}); // 6
} That's a bit of an eyesore in my opinion. With {
const ctxA = new AsyncContext.Variable();
const ctxB = new AsyncContext.Variable();
const ctxC = new AsyncContext.Variable();
using _a = ctxA.set(1);
using _b = ctxB.set(2);
using _c = ctxC.set(3);
sum(); // 6
} However, in hindsight, even the discards look out of place, and as you mentioned, accidentally leaking data is too risky. An obvious workaround would be to combine those contexts into one object, but it doesn't always make sense to group data. So perhaps the better solution is to actually provide some API to namespace AsyncContext {
export declare function compose<F extends (...args: unknown[]) => unknown>(
assignments: [AsyncContext.Variable<unknown>, unknown][],
fn: F,
...args: Parameters<F>
): ReturnType<F>;
}
AsyncContext.compose(
[
[ctxA, 1],
[ctxB, 2],
[ctxC, 3],
],
sum
); // 6; Such a function has the advantage that the internal implementation could optimize away the callbacks and instead manipulate the contexts directly, leading to better memory usage. |
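A minimal sketch of how such a compose helper could apply every assignment with one context swap rather than one nested callback per variable. Everything here is an assumption made for illustration: the `Variable` class is a sync-only stand-in rather than the proposal's `AsyncContext.Variable`, and `currentFrame` is a hypothetical internal slot holding all values in one immutable map.

```javascript
// Hypothetical stand-in for AsyncContext: a single immutable Map ("frame")
// holds the current value of every variable. Sync-only, not the real API.
let currentFrame = new Map();

class Variable {
  get() {
    return currentFrame.get(this);
  }
  run(value, fn, ...args) {
    return compose([[this, value]], fn, ...args);
  }
}

// Assumed shape of the proposed helper: build one new frame containing all
// assignments, run the callback, then restore the caller's frame.
function compose(assignments, fn, ...args) {
  const previous = currentFrame;
  currentFrame = new Map([...previous, ...assignments]);
  try {
    return fn(...args);
  } finally {
    currentFrame = previous; // restored even if fn throws
  }
}

const ctxA = new Variable();
const ctxB = new Variable();
const ctxC = new Variable();
const sum = () => ctxA.get() + ctxB.get() + ctxC.get();

const total = compose([[ctxA, 1], [ctxB, 2], [ctxC, 3]], sum);
console.log(total); // 6
console.log(ctxA.get()); // undefined, the outer frame was never touched
```

In this sketch a three-variable compose allocates one frame and one stack frame for the callback, where three chained `run()` calls would allocate three of each.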
Imho all of this can be solved in library code which actually makes use of the async context. An example I could think of would be a middleware like the following to provide information about the request. interface TraceInfo {
traceId: string;
userId: string;
url: string;
}
const traceVar = new AsyncContext.Variable<TraceInfo>();
// export helper to get current trace info
export function getTraceInfo() {
return traceVar.get();
}
export async function traceInfoMiddleware(ctx: Context, next: NextFunction) {
// Collect tracing information
const traceId = createRandomTraceId();
const traceInfo: TraceInfo = {
traceId,
userId: ctx.user.id,
url: ctx.req.url,
};
// Run next middleware with trace info in async context now
await traceVar.run(traceInfo, next);
} That is also probably the #1 feature I'm dying to use this for once it becomes available 🚀 Edit: This could probably serve as a solution if you really need to set many variables at once: export function composeAsyncVariables<T>(cb: () => T, ...variables: [AsyncContext.Variable<any>, any][]): T {
if (variables.length === 0) {
return cb();
} else {
const [variable, value] = variables[0];
return variable.run(value, () => composeAsyncVariables(cb, ...variables.slice(1)));
}
}
await composeAsyncVariables(
next,
[var1, value1],
[var2, value2],
); |
It's worth calling out that different parts of the data a user wants stored will likely need to change at different parts of the graph. For example, in tracing we likely have a single trace id stored for a request and all descending async activity, however we would also want to store a span id from the start to the end of a particular interesting execution we want to draw a span around. A span may contain many inner behaviours. If we just stuff everything in one big object we have to reallocate a copy of other things like the trace id reference, which has not changed, every time we change the span id. We would actually prefer to have different stores for different singular purposes. We actually also have many internal things we store for internal visibility, so as it is currently we'd be doing a lot of object rebuilds per request if we tried to fit everything in one object. Whereas if we store each of our differently propagating pieces of data in their own separate stores most of them can just reuse the same store value as the outer async context rather than doing a fresh run and building a new state all the time. |
I am prototyping adding |
I understand that this was discussed in yesterday's meeting, and the decision was that it could be added later as a follow-on. I want to caution that this would likely be a very difficult follow-on to polyfill. Consider a situation in which the engine has a working Given this difficulty, it seems to me that if we think it likely that this will land eventually, then there may be good reasons to do it now rather than later. |
I totally understand your concerns. But even in such cases - where additional properties might be needed in some spans - I still believe it would be less costly to just create a clone of the state object with additional properties than running many async contexts together, considering the amount of work that the JS engine has to perform behind the scenes. |
It's definitely not less costly. I have measured this extensively at Datadog through various permutations of AsyncLocalStorage designs, including with heavy edits to V8. A trace id is set once per request. A span id is typically set a few hundred to a thousand times per request, sometimes tens of thousands in extreme cases. It's not uncommon for context to be propagated across async barriers a hundred thousand or so times per request. I have seen well over a million async transitions in a single request before. The present state of Node.js is effectively doing an object rebuild on every transition, and the performance is...not great. Doing a map clone and edit on-change in JS and then only passing through the frame reference otherwise is about 15x faster. In my experiments with ContinuationPreservedEmbedderData in V8 it can be made an order of magnitude faster with a native map that can be directly memory-cloned and only the changed slot modified. The closer that separation of differently-timed changing data lives to the metal, the faster it is. This is roughly what we've tried to achieve with the AsyncContextFrame concept used in Cloudflare and my in-progress rewrite of AsyncLocalStorage in Node.js. It spawned from a need to heavily prioritize performance of propagating data over performance of altering data, since most context data changes far less frequently than it propagates, while still having reasonable performance for the things which do change somewhat often. An AsyncContextFrame is only built when a store.run(...) happens. Otherwise, the frame value is passed through directly by-reference. By moving the separation machinery to a lower level rather than tracking it at object level, it enabled the optimization of propagation by-reference while at the same time also making it even more desirable to have separate stores as it reduced the frequency that data needed to change for the context management to flow. 
It's not as optimal as it could be: because it is implemented in JS, it can't do the memory-cloned map optimization, but even with rebuilding the map it still managed to be much faster than the object clone version. |
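The frame idea described above can be illustrated with a small sync-only sketch. It is not the Node.js or Cloudflare implementation: `Store`, `bind`, and `currentFrame` are hypothetical names, and the map is rebuilt in JS rather than memory-cloned natively. It only shows the two properties being argued for: a new frame is built solely when a store's `run()` changes a value, and propagation captures the frame by reference.

```javascript
// Simplified illustration of the frame concept; all names are hypothetical.
let currentFrame = new Map();

class Store {
  getStore() {
    return currentFrame.get(this);
  }
  run(value, fn) {
    const previous = currentFrame;
    // A new frame is built only here, when a value actually changes.
    const next = new Map(previous);
    next.set(this, value);
    currentFrame = next;
    try {
      return fn();
    } finally {
      currentFrame = previous;
    }
  }
}

// Propagation captures the frame reference; nothing is copied.
function bind(fn) {
  const captured = currentFrame;
  return (...args) => {
    const previous = currentFrame;
    currentFrame = captured;
    try {
      return fn(...args);
    } finally {
      currentFrame = previous;
    }
  };
}

const traceStore = new Store();
const spanStore = new Store();
const trace = { traceId: "t-1" };
const seen = [];
let callback;

traceStore.run(trace, () => {
  spanStore.run({ spanId: "s-1" }, () => {
    seen.push(traceStore.getStore());
    callback = bind(() => [traceStore.getStore(), spanStore.getStore().spanId]);
  });
  spanStore.run({ spanId: "s-2" }, () => {
    seen.push(traceStore.getStore());
  });
});

const [laterTrace, laterSpan] = callback(); // invoked after every run() returned
console.log(seen[0] === trace && seen[1] === trace); // true
console.log(laterTrace === trace, laterSpan); // true s-1
```

Changing the span twice never rebuilds the trace object; both spans see the same `trace` reference, which is the reason given above for keeping differently-timed data in separate stores.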
An interesting case is generators: function* gen() {
using _ = storage.disposable(value);
yield 1;
// what context is restored here?
} Of course we'd want |
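A rough sketch of why the generator case is awkward for a mutate-and-restore design. The `Variable` class below is a hypothetical sync-only stand-in whose `set()` mutates the value in place and returns a restore function; the `try`/`finally` stands in for a `using` declaration, which would likewise dispose only when the generator body exits.

```javascript
// Hypothetical mutate-in-place variable; not the proposal's API.
class Variable {
  #value;
  get() {
    return this.#value;
  }
  // Sets the value immediately and returns a function restoring the old one.
  set(value) {
    const previous = this.#value;
    this.#value = value;
    return () => {
      this.#value = previous;
    };
  }
}

const storage = new Variable();

function* gen() {
  const restore = storage.set("inside gen"); // stands in for `using _ = ...`
  try {
    yield 1;
    yield storage.get();
  } finally {
    restore(); // runs only when the generator finishes or is closed
  }
}

storage.set("caller");
const it = gen();
it.next(); // runs gen up to its first yield
const leaked = storage.get(); // the caller now observes gen's value
it.return(); // closing the generator triggers the restore
const afterClose = storage.get();

console.log(leaked); // "inside gen"
console.log(afterClose); // "caller"
```

While the generator is suspended at `yield`, its scope is still open, so under these assumptions the value it set stays visible to whoever resumes execution.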
Thanks @Qard, those are some very interesting insights which I didn't expect. |
@jridgewell I don't understand your question; why would there be any mutation? |
Because I'm still within the same scope, the |
Related: tc39/proposal-using-enforcement#1 |
I think avoiding nesting can be solved without explicit resource management if we're willing to require some setup for the functions that want to switch within their invocation. const varA = new AsyncContext.Variable();
const varB = new AsyncContext.Variable();
const doStuff = AsyncContext.bindVariables((setVariable, foo) => {
setVariable(varA, 123);
setVariable(varB, foo);
someOtherCall();
});
doStuff("foo"); |
I think that achieves the same 1-layer-nesting that runAll does. |
Right, but with the additional ability to conditionally switch variables during the execution of the function. Not sure if that's a use case, to be honest I'm fuzzy on the motivations. |
It's not clear to me that To my mind, as long as you own the context, there's not really anything too troubling about mutating a variable within that context. The problem with the disposable approach is that you may be in a shared context and thus leak your mutations somewhere unexpected. A simple userland workaround if you happen to own the variable is to add an extra level of indirection: use a @mhofman's suggestion of exposing a setter that only works in your owned context seems like a good compromise, though I suspect the API could be less confusing as a "run" rather than a "bind" (but maybe there's a performance reason I'm unaware of where the bind approach makes more sense?): const result = v.runWithSetter((setV, boundArg1, boundArg2) => {
// do stuff
setV(newValue);
// do more stuff
return someResult;
}, arg1, arg2); Or the batched approach const result = AsyncContext.runWithSetter((setVariable, ...boundArgs) => {
setVariable(varA, 123);
setVariable(varB, 456);
// do stuff
setVariable(varA, 789);
// do more stuff
return someResult;
}, ...passedArgs); I could also imagine a method or function decorator doing this a bit more automatically (though the changed signature might be a dealbreaker depending on how TypeScript ultimately handles that): @bindVariables
function(setVariable, arg1, arg2) { /* ... */ } But I agree that without a better understanding of the use case, it's hard to justify. Different approaches to solving this are easier/harder to polyfill, particularly in an incremental way. For what it's worth, I'm inclined to strictly enforce wrapping |
See #60 (comment) for the author's intention.
I don't understand why moving mutation into the callback is a better choice than having them grouped outside the callback with If both |
v.run(1, () => {
const a = doSomething();
v.run(2, () => {
const b = doSomethingElse();
v.run(3, () => {
putItAllTogether(a, b);
});
});
}); vs. runWithSetter((set) => {
set(v, 1);
const a = doSomething();
set(v, 2);
const b = doSomethingElse();
set(v, 3);
putItAllTogether(a, b);
}); This latter use case is not possible in userland unless you have control over the variable and are willing to add an extra layer of indirection, which likely also hurts usability. One might argue that if you don't own the variable, you shouldn't be able to mutate it - but in this case, you do own the scope in which you're mutating it, and that's at least something. It satisfies the condition that you can't affect your caller's scope, but it does have the potential downside that functions you call can no longer rely on variables being immutable across async continuations (i.e. the values can change out from under them), which may still be a concern. |
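One possible reading of `runWithSetter`, sketched under stated assumptions: the callback receives its own copy of the current frame, the setter mutates only that copy, and the setter stops working once the callback returns. `Variable`, `currentFrame`, and these semantics are hypothetical and sync-only; nothing here is part of the proposal.

```javascript
// Hypothetical frame-based stand-in for AsyncContext.
let currentFrame = new Map();

class Variable {
  get() {
    return currentFrame.get(this);
  }
}

// Assumed semantics: fn owns a private copy of the frame for its duration.
function runWithSetter(fn, ...args) {
  const previous = currentFrame;
  const owned = new Map(previous);
  let active = true;
  const set = (variable, value) => {
    if (!active) throw new Error("setter used outside its owning scope");
    owned.set(variable, value); // mutates only the owned frame
  };
  currentFrame = owned;
  try {
    return fn(set, ...args);
  } finally {
    active = false;
    currentFrame = previous; // the caller's frame was never modified
  }
}

const v = new Variable();
const log = [];
const result = runWithSetter((set, label) => {
  set(v, 1);
  log.push(v.get());
  set(v, 2);
  log.push(v.get());
  return `${label}:${v.get()}`;
}, "done");

console.log(log); // [ 1, 2 ]
console.log(result); // done:2
console.log(v.get()); // undefined
```

Because `owned` is mutated in place, anything that captured that frame reference before a later `set` call would observe the new value, which is the "values can change out from under them" downside noted above.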
This is my first time going through this proposal, and as I was looking at the following example, I couldn't help but feel like I was in callback hell again:
I understand that this is probably not a practical example, but regardless, the proposal only offers a callback-style API. So it got me thinking about how we could flatten these callbacks, and I realized that there is an opportunity to interoperate with the Explicit Resource Management proposal. Specifically, AsyncContext.Variable instances could also have a set() method which returns a Promise that resolves to a disposable resource when the context value is updated, and upon disposing the resource, the context is changed back.
In the case of the example above, we would be able to clean it up like so:
To be clear, the lifetime of an AsyncContext value using set() isn't quite semantically equivalent to using run(), but this idea is more about readability and exposing interoperability with other patterns.
It's a pretty nitpicky suggestion, but I'd be interested in hearing what other people think about it.