ProcessWorkflowTask is not stopped on worker.Stop() #1706
Comments
I am not sure we want workflow heartbeating to stop just because the poller stopped. This is used by long-running local activities. Just because the poller stopped doesn't mean we want to interrupt existing tasks (which can include local activity retries if the policy wants it). A worker stopping still allows all workflow and activity tasks to complete. If the local activity worker is already stopped, then yes, there may be some other issue concerning not properly waiting for the local activity to complete on worker shutdown.
As I understand, there are at least three ways for the local activity task to "hang forever":
sdk-go/internal/internal_task_handlers.go Line 1366 in 9d74a90
sdk-go/internal/internal_task_pollers.go Lines 220 to 227 in 9d74a90
sdk-go/internal/internal_task_pollers.go Lines 606 to 609 in 9d74a90
Since the local activity worker is stopped first, I thought it would be fine to simply stop sending heartbeats.
Hrmm, I suspect we may be prematurely stopping the local activity worker before its tasks are done. Let me confer with the team, but I agree there's no need to continue workflow task heartbeating if the local activity is not running.
Expected Behavior
ProcessWorkflowTask should stop.
Actual Behavior
A WorkflowTask started by a stopped worker continues to send heartbeats, blocking further workflow execution.
Steps to Reproduce the Problem
The following code is based on the server's per-namespace worker implementation. It starts the worker, stops it, and then starts it again using the same client.
The workflow executes a series of local activities, similar to the server's Scheduler workflow.
If the worker is stopped during the execution of a local activity, we will see a "canceled" error in the logs:
However, this error is ignored by the workflow, which will continue processing the current WorkflowTask and will schedule another local activity. This activity will not be executed since the worker is stopped.
ProcessWorkflowTask will remain active, though, and will keep sending heartbeats until either a network timeout occurs or the history size limit is reached. Heartbeats can be seen in the logs:
Specifications