Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[useResponseCache] Add the computed scope in buildResponseCacheKey function params #2181

Open
k-i-k-s opened this issue Mar 19, 2024 · 8 comments
Labels
good first issue help wanted Extra attention is needed

Comments

@k-i-k-s
Copy link

k-i-k-s commented Mar 19, 2024

The problem

Currently the scope PUBLIC creates a cache key per user, since I include the session in the key string.

But I'd like to have only ONE key globally shared when it is PUBLIC, and a key per session when it is PRIVATE.

A viable solution

To be able to generate a key that is shared across all users, I'd need a property that tells the final computed scope, named scope or isPrivate.

Maybe I'm missing something, I didn't find any way to reach this goal with the current state of the plugin.
Is it something you considered or conversely that you don't want to include in the response-cache ?

To me it appears like a light solution, that could bring more flexibility when building the cache key.

This update should be reflected in @graphql-yoga/plugin-response-cache too.

@k-i-k-s k-i-k-s changed the title [response-cache] Add the computed scope in buildResponseCacheKey function params [useResponseCache] Add the computed scope in buildResponseCacheKey function params Mar 19, 2024
@n1ru4l n1ru4l added help wanted Extra attention is needed good first issue labels Mar 19, 2024
@EmrysMyrddin
Copy link
Collaborator

If I understand your problem properly, you have a custom key builder right ? And it includes the session id in the key ?

The problem is that with current implementation, giving access to the scope is probably not that easy.

The key is built very early in the pipeline, which prevents us from giving access to some things that are computed later, such as the TTL or the scope.

I have to read the code again to verify if it is doable.

@k-i-k-s
Copy link
Author

k-i-k-s commented Mar 20, 2024

Yes we have to build a custom key because of some context variables needed to fluctuate the key, the session id is one of them.

I checked the code and the key is indeed built early, the scope is computed later from the result.data.
It is still possible to get it I think considering that scopePerSchemaCoordinate already has the scopes from the schema.

The final scope could be calculated from parsing the query itself before getting the cache key, then getting the object keys in an array and passing them to the isPrivate function. Maybe memoize it to prevent later calculation with the same query.

@k-i-k-s
Copy link
Author

k-i-k-s commented Mar 20, 2024

On apollo-response-cache plugin when the scope is PUBLIC it saves only one key in cache, the parsing behaviour could replicated from their source code.

@EmrysMyrddin
Copy link
Collaborator

I have checked, I confirm that this is not possible without sacrificing performance.
Our implementation of response-caching is based on the actual response returned after the execution, which means that the cache key can't be based on the scope.

This allows us for example to entirely skip the parse/validation phases, which can be a major performance boost on medium to large requests.

On Yoga, it also allows us to take advantage of the HTTP caching mechanism, meaning we don't even have to actually consume the body of the request to send back the response on cache hit.

The scope is here only to enforce the fact that a response should never be cached if there is no sessionId with a PRIVATE scope.

@k-i-k-s
Copy link
Author

k-i-k-s commented Mar 20, 2024

The main problem we currently have with the response cache plugin is that it creates a cache entry per session even on PUBLIC scopes.

We are migrating from Apollo for the performances of Yoga and the killer feature invalidationViaMutation, but we can't afford to have a query processed by session on these PUBLIC queries.

I understand your concerns about performances issues and latency in general. A potential solution would be to memoize the query or a hash with the final scope in a Map to avoid parsing it again.

Perhaps I don't get the whole picture, tell me if so.

Anyhow know that I'd be delighted to help adding this feature !

@EmrysMyrddin
Copy link
Collaborator

EmrysMyrddin commented Mar 20, 2024

Yes that's probably the way to go.

I'm not sure how we should provide this feature.

I see tow approach:

  • The plugin itself keeps a map of operation hash => { scope, ttl } and provides this to the cache key builder.
  • The plugin adds some way to hook into the caching mechanism and provide some callback to be called with document and the derived metadata. Then, each users can decide to plug it's own logic, like in your case keep a list of public operation hash

In case of the second solution, we could change the Cache.set signature to also pass the scope to it. This way, you could just make a custom cache store implementation.

Example:

import { createInMemoryCache } from '@envelop/response-cache'

export function createCache(options) {
  const cache = createInMemoryCache()
  const publicDocumentsHash = [] // If you have a lot of different public document, a Set can be more efficient

  return {
    set(responseId, result, collectedEntities, ttl, metadata) {
      const { scope, documentString } = metadata
      if(scope === "PUBLIC"({
        if(!publicDocumentsHash.includes(hash) {
          publicDocumentsHash.push(hash)
        }
      }
      return cache.set(responseId, result, collectedEntities, ttl, metadata)
    },
    get(...args) {
      return cache.get(...args)
    },
    invalidate(...args) {
      return cache.invalidate(...args)
    },
    isPublic(documentString) {
      return publicDocumentsHash.includes(hash256(documentString))
    }
  }
}

@EmrysMyrddin
Copy link
Collaborator

@ardatan What do you think ? I tend to prefer the second solution, because it will not cause any memory penalty for users that doesn't need this. In the other hand, the boilerplate to handle this in user land is not that straight forward.

@Aloxbro
Copy link

Aloxbro commented Apr 9, 2024

@EmrysMyrddin could you review the job done by @Dunk14 ? It seems to be a working solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

4 participants