Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix TestWorkflowTaskRedirectInRetryFirstTask flake #6189

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

carlydf
Copy link
Contributor

@carlydf carlydf commented Jun 22, 2024

What changed?

  • Wait for build id search attribute v1.1 to land in visibility before validating entire build id search attribute structure (this solves the observed flake)
  • Wait for build id search attribute v1.2 before validating it (not sure if this would flake, but it could)
  • Add comment explaining that we should wait before validating

Why?

Here's a run with the latest flake: https://github.com/temporalio/temporal/actions/runs/9575636250/job/26400905169
We need to make sure that the build ID search attribute has enough time to update before we validate all the parts of it.

How did you test it?

Tested locally, testing in CI now

Potential risks

None

Documentation

Commented function

Is hotfix candidate?

No

@carlydf carlydf requested a review from a team as a code owner June 22, 2024 00:06
@@ -5022,6 +5024,13 @@ func (s *VersioningIntegSuite) waitForWorkflowBuildId(
)
}

// validateWorkflowBuildIds gets the specified workflow execution from visibility and validates that the BuildIds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think DescribeWorkflowExecution make visibility calls. it loads the execution, including its SAs, from the DB. so this comment is not accurate.

// - expectedBuildId: the currently-assigned build id for the workflow execution (only present in newVersioning)
// - expectedStampBuildId: the most recent version of a worker that processed a task for that workflow execution
//
// If expectedBuildId != "" and newVersioning == true, waitForWorkflowBuildId should be called before this to prevent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I'm calling this waitForWorkflowBuildId in firstWorkflowTaskAssignmentSpooled is that I'm checking wf build ID before the task is dispatched (when worker is not running). That relies on the history task processing timing because MS build ID is updated after the transfer task is sent to Matching and it is backlogged.

MS build ID is also updated when the task is dispatched (when record*TaskStarted is called). So in tests that wait for task dispatch via other means like waiting for the timedoutTask channel for example, calling waitForWorkflowBuildId should not add any value. Still I see I called it in firstWorkflowTaskAssignmentSyncMatch and testWorkflowTaskRedirectInRetry that already are waiting for channels. Not sure if I had a reason or it was a mistake. Wish I had put a note there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants