fix: fetch snapshotter proxy object without holding cache lock #685

Closed · wants to merge 1 commit into from
6 changes: 4 additions & 2 deletions snapshotter/demux/cache/snapshotter_cache.go
@@ -43,16 +43,18 @@ func (cache *SnapshotterCache) Get(ctx context.Context, key string, fetch Snapsh

if !ok {
cache.mutex.Lock()
defer cache.mutex.Unlock()

snapshotter, ok = cache.snapshotters[key]
cache.mutex.Unlock()
Contributor

We use a reader lock above. Should we be doing that here too?

Contributor Author

Oof, that adds a new level of complexity if we need to account for a second double-checked lock.

Contributor

I'm confused. If !ok, wouldn't cache.snapshotters[key] always be nil?

Contributor Author

@kzys, unless another thread populates the cache after we have a cache miss but before we acquire the writer's lock. The cache entry needs to be .Close()d before it is garbage collected, to clean up the system resources held by the metrics proxy.

Contributor Author

But @ginglis13 is correct: this solution won't work for the edge cases, because we can now leak a snapshotter that is fetched after the reader's-lock cache miss but before the writer's lock is acquired.

Contributor

@kzys kzys Jun 24, 2022

How about moving cache.mutex.RUnlock() into this if block? The snapshotter doesn't do many time-consuming operations between RUnlock and Lock.

Contributor Author

That may result in a deadlock: we'd block on Lock without having released the reader's lock via RUnlock. I don't believe RWMutex is smart enough to grant the writer's lock even when the only outstanding reader's lock is held by the same goroutine.


if !ok {
newSnapshotter, err := fetch(ctx, key)
if err != nil {
return nil, err
}

cache.mutex.Lock()
cache.snapshotters[key] = newSnapshotter
cache.mutex.Unlock()
snapshotter = newSnapshotter
}
}