The demux snapshotter utilizes a snapshotter cache to funnel requests to the appropriate remote snapshotter.
This cache enables two things. First, performance: creating the proxy object can be an expensive operation. Second, the cache backs service discovery for the metrics proxy.
Snapshotter requests can occur in parallel, so we need to protect the cache's memory. The existing implementation performs the following operations (sketched in code after the list):
1. Acquire reader's lock.
2. Fetch snapshotter from cache.
3. Release reader's lock.
4. If cache hit, done.
5. If cache miss, acquire writer's lock.
6. Fetch snapshotter from cache.
7. If cache hit, jump to step 10.
8. If cache miss, continue.
9. Create cache entry using the fetch function.
10. Release writer's lock.
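Rendered as code, a minimal sketch of that existing flow. It assumes the same shape used in the snippets further down (a SnapshotterCache holding a sync.RWMutex and a map[string]*proxy.RemoteSnapshotter, plus a SnapshotterProvider fetch function); the exact field names are assumptions, and the step numbers from the list appear as comments:

type SnapshotterCache struct {
	mutex        sync.RWMutex
	snapshotters map[string]*proxy.RemoteSnapshotter
}

// Get is the existing flow: the double check happens under the writer's
// lock, and the expensive fetch also runs while that lock is held.
func (cache *SnapshotterCache) Get(ctx context.Context, key string, fetch SnapshotterProvider) (*proxy.RemoteSnapshotter, error) {
	cache.mutex.RLock()                        // 1. acquire reader's lock
	snapshotter, ok := cache.snapshotters[key] // 2. fetch snapshotter from cache
	cache.mutex.RUnlock()                      // 3. release reader's lock
	if ok {
		return snapshotter, nil // 4. cache hit, done
	}

	cache.mutex.Lock()         // 5. cache miss, acquire writer's lock
	defer cache.mutex.Unlock() // 10. release writer's lock on return
	if snapshotter, ok := cache.snapshotters[key]; ok {
		return snapshotter, nil // 6, 7. hit on the second check
	}

	// 8, 9. still a miss: create the cache entry via the fetch function,
	// holding the writer's lock for the whole (possibly slow) fetch.
	newSnapshotter, err := fetch(ctx, key)
	if err != nil {
		return nil, err
	}
	cache.snapshotters[key] = newSnapshotter
	return newSnapshotter, nil
}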
We utilize the double-checked lock to ensure no system resources are leaked if two threads populate the cache entry concurrently. For example, without the second check under the writer's lock:
Thread A - acquire reader's lock, cache miss, release reader's lock, and context switched.
Thread B - acquire reader's lock, cache miss, release reader's lock, and context switched.
Note: at this point both threads will have had a cache miss and are on course to populate the cache.
Thread A - acquire writer's lock, populate cache entry, release writer's lock.
Thread B - acquire writer's lock, populate cache entry, release writer's lock.
Note: at this point the cache entry from Thread A is leaked, overwritten by Thread B's. While garbage collection will reclaim the object itself, these entries manage system resources that enable the metrics proxy; in this case, a system port on which the metrics proxy HTTP server is listening.
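To make the leak concrete, here is a self-contained toy (all names hypothetical; a fake closable type stands in for proxy.RemoteSnapshotter) that forces the interleaving above and skips the second check, so the later writer silently evicts the earlier entry without closing it:

package main

import (
	"fmt"
	"sync"
)

type fakeSnapshotter struct {
	id     int
	closed bool
}

// Close stands in for proxy.RemoteSnapshotter.Close, which would release
// the entry's system resources (e.g. the metrics proxy's server port).
func (s *fakeSnapshotter) Close() { s.closed = true }

func main() {
	var (
		mu      sync.RWMutex
		cache   = map[string]*fakeSnapshotter{}
		wg      sync.WaitGroup
		checked sync.WaitGroup // barrier: both threads check before either populates
	)
	made := make([]*fakeSnapshotter, 2)
	wg.Add(2)
	checked.Add(2)
	for i := 0; i < 2; i++ {
		go func(id int) {
			defer wg.Done()
			mu.RLock()
			_, ok := cache["key"] // both threads miss here...
			mu.RUnlock()
			checked.Done()
			checked.Wait() // ...because this barrier forces the interleaving above
			if !ok {
				s := &fakeSnapshotter{id: id} // the "expensive fetch"
				made[id] = s
				mu.Lock()
				cache["key"] = s // no second check: a later write evicts an earlier one
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()

	for _, s := range made {
		if cache["key"] != s && !s.closed {
			fmt.Printf("snapshotter %d leaked: evicted but never closed\n", s.id)
		}
	}
}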
Challenge:
The issue is that the writer's lock is held during the cache entry fetch, which we have observed can be an expensive operation on some systems. The ideal solution would release the lock after a writer's-lock cache miss; however, we must be cognizant of the above scenario and avoid leaking resources.
If we are to stick to the current mechanisms, with a single RWMutex and map, then the solution may look like this:
// Get fetches and caches the snapshotter for a given key.
func (cache *SnapshotterCache) Get(ctx context.Context, key string, fetch SnapshotterProvider) (*proxy.RemoteSnapshotter, error) {
	cache.mutex.RLock()
	snapshotter, ok := cache.snapshotters[key]
	cache.mutex.RUnlock()

	if !ok {
		// Fetch outside the lock; this may race with other callers.
		newSnapshotter, err := fetch(ctx, key)
		if err != nil {
			return nil, err
		}

		cache.mutex.Lock()
		defer cache.mutex.Unlock()
		snapshotter, ok = cache.snapshotters[key]
		if !ok {
			cache.snapshotters[key] = newSnapshotter
			snapshotter = newSnapshotter
		} else {
			// Another caller won the race; release the duplicate's resources.
			newSnapshotter.Close()
		}
	}
	return snapshotter, nil
}
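With this shape the writer's lock protects only the map check and insert; the expensive fetch runs unlocked, at the cost that concurrent callers may fetch redundantly on a miss, with every loser closing its duplicate so only one entry's resources survive per key.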
Although I am the author, even I admit it looks somewhat ugly. It could potentially be improved by breaking the separate pieces of functionality out into named functions. I also have a cache refactor out for review that reworks the fetch function into a required instance variable, which will eliminate the need to pass it through.
// Get fetches and caches the snapshotter for a given key.
func (cache *SnapshotterCache) Get(ctx context.Context, key string, fetch SnapshotterProvider) (*proxy.RemoteSnapshotter, error) {
	cache.mutex.RLock()
	snapshotter, ok := cache.snapshotters[key]
	cache.mutex.RUnlock()

	if !ok {
		var err error
		if snapshotter, err = cache.pullThrough(ctx, key, fetch); err != nil {
			return nil, err
		}
	}
	return snapshotter, nil
}
func (cache *SnapshotterCache) pullThrough(ctx context.Context, key string, pull SnapshotterProvider) (*proxy.RemoteSnapshotter, error) {
	snapshotter, err := pull(ctx, key)
	if err != nil {
		return nil, err
	}

	cache.mutex.Lock()
	defer cache.mutex.Unlock()

	if s, ok := cache.snapshotters[key]; ok {
		// Entry pulled through by another thread. Clean up resources allocated.
		snapshotter.Close()
		return s, nil
	}

	cache.snapshotters[key] = snapshotter
	return snapshotter, nil
}
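And a quick standalone harness for the pull-through shape (again with hypothetical stand-in types, not the real cache): eight goroutines race a pullThrough for the same key, every loser's snapshotter gets closed, and exactly one remains open:

package main

import (
	"fmt"
	"sync"
)

type fakeSnapshotter struct{ closed bool }

func (s *fakeSnapshotter) Close() { s.closed = true }

// miniCache mirrors the pull-through shape above, for the fake type.
type miniCache struct {
	mutex        sync.RWMutex
	snapshotters map[string]*fakeSnapshotter
}

func (c *miniCache) pullThrough(key string, pull func() *fakeSnapshotter) *fakeSnapshotter {
	snapshotter := pull() // the expensive fetch, outside any lock
	c.mutex.Lock()
	defer c.mutex.Unlock()
	if s, ok := c.snapshotters[key]; ok {
		snapshotter.Close() // lost the race: release the duplicate's resources
		return s
	}
	c.snapshotters[key] = snapshotter
	return snapshotter
}

func main() {
	c := &miniCache{snapshotters: map[string]*fakeSnapshotter{}}
	made := make([]*fakeSnapshotter, 8)
	var wg sync.WaitGroup
	for i := range made {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.pullThrough("key", func() *fakeSnapshotter {
				made[i] = &fakeSnapshotter{}
				return made[i]
			})
		}(i)
	}
	wg.Wait()

	open := 0
	for _, s := range made {
		if !s.closed {
			open++
		}
	}
	fmt.Printf("fetched %d snapshotters, %d still open\n", len(made), open) // always 1 open
}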