# API review, questions, brainstorming #298
Thanks for your inputs, @zolkis! I have some initial comments regarding sync/async and implementation. From the spec:
That's true for the current spec. However, the WG agreed to support async context creation. There are two pending PRs (#274 and #285) that add the async context creation method. They are pending on the TAG review of the naming.
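For illustration, a minimal sketch of what async creation could look like, assuming the `createContext()` method shape from those PRs (names and option keys are provisional until the TAG review concludes):

```js
// Hypothetical usage; method name and option keys follow the pending
// PRs (#274, #285) and may still change.
const context = await navigator.ml.createContext({
  deviceType: "gpu",                   // requested device
  powerPreference: "high-performance"  // a hint, not a guarantee
});
```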
The CPU context also has an async computeAsync() method. As the spec says:

> Asynchronously carries out the computational workload of a compiled graph MLGraph on a separate timeline, either on a worker thread for the CPU execution, or on a GPU timeline for the submission of GPU workload on the command queue.

From the workflow:
Ditto: one can execute (compute) a graph synchronously (CPU, GPU). From the implementation:
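As a sketch, the two execution paths side by side; `context` and `graph` are assumed to exist already, and the `compute()`/`computeAsync()` signatures are indicative only:

```js
// Made-up buffer shapes, for illustration.
const inputs  = { x: new Float32Array(4) };
const outputs = { y: new Float32Array(4) };

// Sync variant: blocks the calling thread until results are written.
context.compute(graph, inputs, outputs);

// Async variant: runs on a separate timeline (worker thread for CPU,
// or the GPU command queue), per the spec text quoted above.
await context.computeAsync(graph, inputs, outputs);
```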
That just reflects the current implementation. The MLContext should be implemented on top of a native ML API, for example to check whether a required device, such as a GPU, is capable of supporting the WebNN graph execution.
No. There is no custom operator support in the current WebNN spec. There was a related discussion in #6.
---

Thanks for the replies. Here is a second round of questions/arguments about the following topics. Please bear with me. :)

### Context

It seems the ML frameworks do without it, but I agree it might be future-proof thinking to introduce the notion of a context, comprising the HW abstraction, the resources needed, etc. So we can create contexts in a sync or an async way.

```webidl
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface ML {
Promise<MLContext> createContext(optional MLContextOptions options = {});
};
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLContext {
constructor(optional MLContextOptions options = {});
readonly attribute MLDeviceType deviceType;
readonly attribute MLPowerPreference powerPreference;
};
```

Also, it might make sense to expose the content of the context instead of internal slots, for testing, clarity, etc. There might be other internal slots of the context that we don't want to expose, but these data were provided by the script.

### CPU vs GPU context

The CPU and GPU contexts are rather different. Currently the differentiation is going to live in the algorithms. That is both obscure and a complication for algorithms and testing alike.

### Builder vs context

I needed multiple re-reads to rethink the developer use cases here.

```webidl
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLGraphBuilder {
// Construct the graph builder without context.
constructor();
// ... include all the current functions, except the following
};
```

We use the partial interface:

```webidl
partial interface MLGraphBuilder {
// Compile the graph up to the specified output operands asynchronously.
Promise<MLGraph> build(MLContext context, MLNamedOperands outputs);
// Compile the graph up to the specified output operands synchronously.
MLGraph buildSync(MLContext context, MLNamedOperands outputs);
};
```

### Graphs, encode, compute

We have arrived at the most important interface IMHO, the `MLGraph`:

```webidl
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLGraph {
readonly attribute MLContext context;
Promise<undefined> compute(MLNamedArrayBufferViews inputs, MLNamedArrayBufferViews outputs);
undefined computeSync(MLNamedArrayBufferViews inputs, MLNamedArrayBufferViews outputs);
};
```

### Command encoder

In addition, we have the command encoder for GPU contexts.
Since an encoder is first tied to a graph, which is tied to a context, I would start with option 2, write the algorithm, see if it's good enough, then explore the other options. I think with these changes the examples would become more intuitive (see the sketch below). I need to check if there are implementation blockers (this comment may be updated depending on the findings).
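For illustration, a hedged sketch of the developer flow under this shape; all names are provisional, and the `type`/`dimensions` keys in the operand descriptor are assumptions:

```js
// Context-free construction: the builder records the topology only.
const builder = new MLGraphBuilder();
const a = builder.input("a", { type: "float32", dimensions: [2, 2] });
const b = builder.input("b", { type: "float32", dimensions: [2, 2] });
const c = builder.add(a, b);

// Late-bind a context at compile time, per the build() proposed above.
const context = await navigator.ml.createContext({ deviceType: "cpu" });
const graph = await builder.build(context, { c });

// Execute via the MLGraph interface sketched above.
const outputs = { c: new Float32Array(4) };
await graph.compute({ a: new Float32Array(4), b: new Float32Array(4) }, outputs);
```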
---

Additional arguments came up in a discussion with @huningxin and @anssiko.

### Builder vs context

In #149, it was discussed whether builders should be context-agnostic before compiling to e.g. CPU or GPU. A counterargument was that there might already be some preload happening in the build phase, and it would be inefficient to undo that later; therefore the builder should be early-bound to a context. It is not clear whether this preload could be postponed until a later point. The first argument is a developer use case: being able to (efficiently) reuse a pre-compilation graph, compiling it for different contexts. I think both should be enabled by design. So while early-binding the context to the builder, it should be easy to reuse programmatically built structures, typically migrating from CPU to GPU. That could be achieved in a number of ways.
I think the current way is good enough (perhaps even the best, since it avoids complex transforms), so we don't necessarily need to change the API, but I would be interested in what people think today. It would certainly be a more generic approach to explicitly expose the boundaries in the typical developer flow, as described in this comment.
In any case, we should include an example of how to solve this developer use case (one possible shape follows below), and mention that implementations may use early optimizations in the build process.
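One possible shape for such an example, under the current early-bound API (a sketch; it assumes the context-bound `MLGraphBuilder(context)` constructor and the async `build()` from the pending PRs):

```js
// Capture the topology in a function, so it can be replayed on any builder.
function buildModel(builder) {
  const x = builder.input("x", { type: "float32", dimensions: [4] });
  return { y: builder.relu(x) };
}

// Build (and potentially preload) for a CPU context...
const cpuContext = await navigator.ml.createContext({ deviceType: "cpu" });
const cpuBuilder = new MLGraphBuilder(cpuContext);
const cpuGraph = await cpuBuilder.build(buildModel(cpuBuilder));

// ...then reuse the same structure for a GPU context.
const gpuContext = await navigator.ml.createContext({ deviceType: "gpu" });
const gpuBuilder = new MLGraphBuilder(gpuContext);
const gpuGraph = await gpuBuilder.build(buildModel(gpuBuilder));
```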
---

Another argument in #149 is whether a context was created using defaults or explicit options.

### Context creation: implicit vs explicit

This comment states:
This is a fundamental difference in behavior, which normally would warrant defining separate, easily identifiable algorithms, or even separate interfaces. I think we can keep one interface, but we need a way to cleanly separate these behaviors in the spec without creating confusion. We could make referenceable definitions for the two classes/behaviors of context types. To keep changes minimal, I propose introducing an explicit device type:

```webidl
enum MLDeviceType {
"auto",
"cpu",
"gpu"
};
```

Note that this aligns well with MLDevicePreference from the Model Loader API. I think this would clarify (make explicit) the user's choice, and would make the spec more concise as well.
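In script, the explicit value would make the delegated case visible (hypothetical usage of the enum above):

```js
// Explicit choice: the script asks for CPU execution.
const cpuContext = await navigator.ml.createContext({ deviceType: "cpu" });

// Explicit delegation: the script visibly leaves the device choice to
// the UA, rather than relying on an implicit default.
const autoContext = await navigator.ml.createContext({ deviceType: "auto" });
```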
---

At the moment this is speculation, but based on what I read so far, it seems to me that an `MLContext`-centric design could work. One version of a minimal Web IDL that would enable this could be the following (not a proposal at this point, only an illustration for discussion):

```webidl
interface mixin NavigatorML {
[SecureContext, SameObject] readonly attribute ML ml;
};
Navigator includes NavigatorML;
WorkerNavigator includes NavigatorML;
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface ML {
Promise<MLContext> createContext(optional MLContextOptions options = {});
Promise<MLContext> createContext(GPUDevice gpuDevice);
};
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLContext {
constructor(optional MLContextOptions options = {});
readonly attribute MLContextOptions options;
// graph is an internal slot (eventually in different forms during lifecycle)
// lifecycle state is an internal slot, e.g. "building", "compiled", "encoding", "encoded", "compute" etc.
readonly attribute MLGraphBuilder builder; // dedicated builder operates on internal slots
Promise<undefined> build(MLNamedOperands outputs); // async version split out from MLGraphBuilder
Promise<MLNamedArrayBufferViews> compute(MLNamedArrayBufferViews inputs);
};
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLContextGPU : MLContext {};
partial interface MLContextGPU {
constructor(GPUDevice gpuDevice);
// the following deal with command encoding, using internal slots
Promise<undefined> preprocess(); // former initializeGraph
Promise<MLNamedGPUResources> dispatch(MLNamedGPUResources inputs);
Promise<GPUCommandBuffer> encode(optional GPUCommandBufferDescriptor descriptor = {});
};
```

Alternatively, we can fuse these into a single MLContext, and expose an MLCommandEncoder interface attribute that can be
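To make the illustration concrete, a hypothetical script-side view of the same shape (mirroring the IDL above; nothing here is a real API):

```js
const context = await navigator.ml.createContext({ /* MLContextOptions */ });

// The dedicated builder operates on the context's internal graph slot.
const b = context.builder;
const x = b.input("x", { type: "float32", dimensions: [4] });
const y = b.relu(x);

// build() and compute() live on the context itself in this shape.
await context.build({ y });
const result = await context.compute({ x: new Float32Array([1, -2, 3, -4]) });
```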
---

TODO (placeholder): check framework use cases.
---

TL;DR (edited after discussion with @huningxin)

### Proposal for exposing context

This proposal would simplify the discussion in #257, #255, #162. It provides the context type as a single descriptor for the combination of resources used in the context, e.g. a valid combination of device(s), power preference, etc. It enables a simple and more intuitive differentiation in algorithms between script-controlled CPU, script-controlled GPU, and UA-managed (WebGPU) contexts. In addition, it enables future use of hybrid contexts (e.g. CPU+accelerator) as well.

```webidl
enum MLContextType {
"cpu", // script-controlled context
"gpu" // script-controlled context
"webgpu", // managed by the user agent
// later other context types may be defined, even using multiple devices, e.g. "cpu+npu" etc.
// Note: in fact all these context types could be separate interface classes as well...
};
enum MLPowerPreference { // a hint
"default",
"high-performance",
"low-power"
};
dictionary MLContextOptions { // not a hint
MLContextType contextType = "cpu";
MLPowerPreference powerPreference = "default";
GPUDevice? gpuDevice = null;
};
[SecureContext, Exposed=(Window, DedicatedWorker)]
interface ML {
Promise<MLContext> createContext(optional MLContextOptions options);
[Exposed=(DedicatedWorker)]
MLContext createContextSync(optional MLContextOptions options);
// Internal slots
// [[boolean managed]] // `true` if the user agent controls the context (not needed)
// [[MLContextType contextType]]
// [[MLPowerPreference powerPreference]]
// [[implementation]] // perhaps "adapter" would be better
// further methods (and eventually properties) will follow
};
```

So far, not much change, except:
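Usage would look roughly as follows (hypothetical; `gpuDevice` is assumed to come from WebGPU's `adapter.requestDevice()`):

```js
// Async creation, exposed in Window and DedicatedWorker scopes.
const webgpuContext = await navigator.ml.createContext({
  contextType: "webgpu",
  gpuDevice
});

// Sync creation, exposed only in DedicatedWorker per the IDL above.
const cpuContext = navigator.ml.createContextSync({ contextType: "cpu" });
```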
---

The spec contains certain constraints that are hard to describe and enforce via algorithms. For instance, the note from the [MLContext section](https://webmachinelearning.github.io/webnn/#api-mlcontext):
Or, from the MLCommandEncoder section:
To achieve that, there should be a possibility of binding an MLGraph and an MLCommandEncoder to an MLContext of type "webgpu". Therefore I'd add an internal slot [[model]] to MLContext that represents a compute graph bound to a context. If that context is of type "webgpu", then it will have MLCommandEncoder-specific initialization, dispatch and finish (e.g. the MLCommandEncoder interface could be exposed in the context as an attribute).

Also, the discussion in #149 reveals a possible use case for discerning between a compute graph (as built by a builder, which could be executed in multiple contexts) and a graph that is initialized for a given context for execution.

In summary: a builder's output, i.e. an MLGraph, is always bound to an MLContext, so it could as well be (part of) an internal slot of MLContext, and the builder could also be an attribute of MLContext.

Proposal: include the graph builder and the command encoder as attributes of MLContext (a sketch follows below).
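A sketch of how this binding could surface in script; `builder` and `commandEncoder` are hypothetical attribute names following this proposal:

```js
// A "webgpu" context would expose command-encoding entry points.
const context = await navigator.ml.createContext({ contextType: "webgpu", gpuDevice });

// Builder bound to this context; its output populates the [[model]] slot.
const b = context.builder;
// ... build the graph via b ...

// Only meaningful for "webgpu" contexts:
const encoder = context.commandEncoder;
// encoder.initializeGraph(...), encoder.dispatch(...), encoder.finish()
```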
---

Closing this issue, since the conclusions are followed up in separate issues.
---

As a newcomer with fresh/naive eyes, I am using this issue to summarize my thoughts from the perspective of defining future algorithms. I tried to draft some algorithms, and bumped into the question of how context, operators, graphs and internal slots are to be exposed and referenced in the spec.
This is a "questions" category issue; no fix is required, please just bear with me and explain stuff, as I've been missing prior discussions. If needed, targeted issues can be spawned.
From the spec:
Internal slots are used for storing the device type and power preference as string enums; there is also a sync factory method to create a command encoder.
From the explainer:
Basically, the main workflows that are covered:
(Though in theory a graph could also be built based on a JSON or JSON-LD description, I do very well understand why a programmatic approach is more practical here.)
To me it seems the use cases revolve around graphs, which leads to the premise that they could be the primary interface exposed by the spec.
First we mainly construct them, and there I am not sure whether the context is relevant: is this a purely syntactic construction, or are some context-dependent optimizations already supposed to happen, by specification, during the build?
I am checking the implementation to figure out things, but for now I have some questions:
Question: could one bind a user function to it that overrides the default one?
(This post may be edited/reformulated in the future).