Inconsistent order of MessageChannel messages and task (race conditions) #133
The timing on here is... difficult. The communication between the worker and the parent is based on MessageChannel and MessagePort... which generally happens synchronously but relies on signaling via the event loop to determine when the actual message notification happens. Signaling the completion of the work is also initially done by message passing but there the notification of "done" occurs via the microtask queue, which is being drained independently of the event loop. There are quite a few factors involved and the timing is exceedingly difficult to reason about. @addaleax may have some ideas here.
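For illustration (plain Node.js, not Piscina code): a resolved promise is observed via the microtask queue, while a port's 'message' event is delivered through the event loop, so the promise callback can run first even though the message was posted earlier.

```js
const { MessageChannel } = require('worker_threads');

const { port1, port2 } = new MessageChannel();

port2.once('message', () => {
  console.log('message event (delivered via the event loop)');
  port2.close(); // let the process exit
});

port1.postMessage('hello');
Promise.resolve().then(() => console.log('promise callback (microtask queue)'));

// Typical output:
//   promise callback (microtask queue)
//   message event (delivered via the event loop)
```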
This is how it looks in Node.js without piscina. I was thinking about using a second MessageChannel for signaling, but it looks like there are no guarantees regarding the order. This was unexpected but also kind of makes sense.

index.js

```js
const path = require('path');
const { Worker, MessageChannel } = require('worker_threads');

const { port1, port2 } = new MessageChannel();
const worker = new Worker(path.join(__dirname, 'worker.js'));

worker.on('message', (message) => {
  console.log(message);
});

port1.on('message', (message) => {
  console.log(message);
});

worker.on('error', (err) => {
  console.error(err);
});

worker.on('exit', (code) => {
  process.exit(code);
});

worker.postMessage(port2, [port2]);
```

worker.js

```js
const { parentPort } = require('worker_threads');

parentPort.once('message', (port) => {
  port.postMessage('A1');
  parentPort.postMessage('B1');
  port.postMessage('A2');
  parentPort.postMessage('B2');
});
```

Expected (naively):
Actual (but depends on size of messages):
I assume this is because in each event loop iteration every message channel has its own queue and the messages are processed without considering the order of other channels. I can accept that. Interestingly enough the …

So here's a possible solution for piscina: allow user-land messages to be passed over the same port that piscina is using, but transparently of course. Assuming that the order of …

In the worker:

```js
// A special `parentPort` that allows sending messages through the same port that piscina uses internally.
const { parentPort } = require('piscina');
```

Now the question is how this could look on the other end. Could the promise returned from …
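As a sketch only (nothing here exists in Piscina; `onMessage` is a hypothetical option name), the receiving end could look something like this:

```js
const { resolve } = require('path');
const Piscina = require('piscina');

const piscina = new Piscina({ filename: resolve(__dirname, 'worker.js') });

async function main () {
  // Hypothetical: messages the worker sends through the special parentPort are
  // delivered to this callback, and the returned promise resolves only after
  // every such message has been observed.
  const result = await piscina.run({ some: 'task' }, {
    onMessage (message) {
      console.log('from worker:', message);
    }
  });
  console.log('done', result);
}

main().catch(console.error);
```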
Though this can provide a sense of order, the truth is that it isn't guaranteed to be deterministic.
That would complicate the implementation and make things more entangled; a separate API would be better. Though, just to understand your issue: are you trying to communicate 1:1 with the workers, not by directly providing a task, but rather by broadcasting a message?
I'm streaming results from the task to the client via WebSockets. While the task is running I keep sending batches of …
Yeah, that makes sense. Do you have a suggestion or an idea of what the API could look like?
I want to point out that what I'm doing is purely unidirectional. The worker is sending data (or task progress) to the main thread. There is no communication in the other direction. The official "progress" example (https://github.com/piscinajs/piscina/blob/current/examples/progress/index.js) likely suffers from the same issue. It can call …
I'm sick right now and can't put any more thought into this. The issue could be solved without any API changes if there were a way to make the task resolve only after all …
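For what it's worth, a user-land workaround is possible today without any Piscina changes (a sketch, relying only on the fact that messages on a single MessagePort are delivered in order): have the worker post a final sentinel on the user port, and have the caller wait for both the task promise and that sentinel.

```js
// worker.js (sketch)
module.exports = ({ port }) => {
  port.postMessage({ type: 'data', value: 1 });
  port.postMessage({ type: 'data', value: 2 });
  port.postMessage({ type: 'end' }); // sentinel: nothing else follows on this port
  return 'done';
};
```

```js
// index.js (sketch)
const { MessageChannel } = require('worker_threads');
const { resolve } = require('path');
const Piscina = require('piscina');

const piscina = new Piscina({ filename: resolve(__dirname, 'worker.js') });

async function runOnce () {
  const { port1, port2 } = new MessageChannel();
  const drained = new Promise((resolveDrained) => {
    port2.on('message', (message) => {
      if (message.type === 'end') return resolveDrained();
      console.log('message', message.value);
    });
  });
  const result = await piscina.runTask({ port: port1 }, [port1]);
  await drained; // every message sent before the sentinel has been observed here
  port2.close();
  return result;
}

runOnce().then(console.log).catch(console.error);
```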
Using the …
This issue has been marked as stale because it has been opened 30 days without activity. Remove stale label or comment or this will be closed in 5 days.
Not stale and still a valid feature request
Piscina 3.0.0 (2.2.0 as well), Node.js 14.17.0 and 16.1.0
In the official example of using MessageChannel the order is not consistent. The task can finish before the `message` event or the other way round. I was naively expecting every `postMessage` inside the worker to show up as a `message` event before `runTask` finishes. But the order is seemingly random, causing funny side effects on my end.

Here's the code adapted from the official example. I haven't seen it ever happen on the first iteration, but often the second (sometimes a handful of iterations) will stop because the task resolved before the message event happened.
Expected order: [message, done]
What I see often: [done, message]
index.js
worker.js
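The code blocks did not survive here, so the following is only a sketch of what the adapted example presumably looks like, based on Piscina's MessageChannel example; the loop, iteration count, and log labels are assumptions.

```js
// index.js (sketch)
const { MessageChannel } = require('worker_threads');
const { resolve } = require('path');
const Piscina = require('piscina');

const piscina = new Piscina({ filename: resolve(__dirname, 'worker.js') });

async function main () {
  for (let i = 0; i < 10; i++) {
    const { port1, port2 } = new MessageChannel();
    const events = [];
    port2.on('message', (message) => {
      events.push('message');
      console.log(i, message);
    });
    await piscina.runTask({ port: port1 }, [port1]);
    events.push('done');
    console.log(i, 'done');
    port2.close(); // note: a not-yet-delivered message may be dropped here
    if (events[0] === 'done') break; // the task resolved before any message event
  }
}

main().catch(console.error);
```

```js
// worker.js (sketch)
module.exports = ({ port }) => {
  port.postMessage('message');
};
```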
Now it gets more interesting with multiple messages; let's send two messages from the worker:
Expected order: [message, message, done]
What I see often: [done, message, message] or even [message, done, message]
index.js
worker.js
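Again, only a sketch: presumably index.js stays the same and worker.js changes to send two messages, along these lines.

```js
// worker.js (sketch, two-message variant)
module.exports = ({ port }) => {
  port.postMessage('message 1');
  port.postMessage('message 2');
};
```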
With such a small message you usually get [done, message, message] on the second iteration. With a larger message you can get [message, done, message] as well, and it often takes more than two iterations.
E.g. here's a run that just took more than two iterations:
If this is expected behavior on Node.js's end, I wonder if Piscina could do something about it. I'm not really into implementing flow control with ACK messages and whatnot when that's exactly why I'm using Piscina in the first place (native worker threads are somewhat painful to use 😅).