Do not acquire a read lock twice on tidyStatusLock during tidy-status #28556

Merged
merged 1 commit into from
Oct 2, 2024

stevendpclark
Contributor

Description

A deadlock can occur when reading the PKI tidy-status endpoint. It was introduced in PR #28488, so no released version of Vault contains this issue.

The deadlock occurs when:

  1. The tidy-status API is called and acquires the read lock on tidyStatusLock at the top of pathTidyStatusRead (line 1689).
  2. A blocking Lock call is made against tidyStatusLock, for example within startTidyOperation (line 999).
  3. The call to b.getLastAutoTidyTime() on line 1727 attempts to re-acquire the read lock on tidyStatusLock. Normally this succeeds because both calls request read locks. Once a write lock request is pending, however, the second read lock queues behind the writer while the writer waits for the first read lock to be released, so we deadlock.

TODO only if you're a HashiCorp employee

  • Backport Labels: If this PR is in the ENT repo and needs to be backported, backport
    to N, N-1, and N-2, using the backport/ent/x.x.x+ent labels. If this PR is in the CE repo, you should only backport to N, using the backport/x.x.x label, not the enterprise labels.
    • If this fixes a critical security vulnerability or severity 1 bug, it will also need to be backported to the current LTS versions of Vault. To ensure this, use all available enterprise labels.
  • ENT Breakage: If this PR either 1) removes a public function OR 2) changes the signature
    of a public function, even if that change is in a CE file, double check that
    applying the patch for this PR to the ENT repo and running tests doesn't
    break any tests. Sometimes ENT only tests rely on public functions in CE
    files.
  • Jira: If this change has an associated Jira, it's referenced either
    in the PR description, commit message, or branch name.
  • RFC: If this change has an associated RFC, please link it in the description.
  • ENT PR: If this change has an associated ENT PR, please link it in the
    description. Also, make sure the changelog is in this PR, not in your ENT PR.

@stevendpclark stevendpclark added this to the 1.17.7 milestone Oct 1, 2024
@stevendpclark stevendpclark self-assigned this Oct 1, 2024
@stevendpclark stevendpclark requested a review from a team as a code owner October 1, 2024 22:32
@github-actions github-actions bot added the hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed label Oct 1, 2024
Contributor

@raskchanky raskchanky left a comment


github-actions bot commented Oct 1, 2024

CI Results:
All Go tests succeeded! ✅


github-actions bot commented Oct 1, 2024

Build Results:
All builds succeeded! ✅

Contributor

@kubawi kubawi left a comment

LGTM, thanks Steve!

@stevendpclark stevendpclark merged commit 7efc1af into main Oct 2, 2024
100 checks passed
@stevendpclark stevendpclark deleted the stevendpclark/vault-31290-pki-lock-race branch October 2, 2024 12:58
@stevendpclark
Contributor Author

Thanks all for the reviews!

Labels
backport/1.17.x backport/1.18.x hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed pr/no-changelog

3 participants