Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: return cached data for a known type/id #155

Open
dustinfarris opened this issue Dec 30, 2016 · 13 comments
Open

RFC: return cached data for a known type/id #155

dustinfarris opened this issue Dec 30, 2016 · 13 comments

Comments

@dustinfarris
Copy link
Contributor

dustinfarris commented Dec 30, 2016

UPDATED 1/3/17

The prefetch hook described below is not in-line with the desired style of Cashay. See discussion and tl;dr in comments.

DISCLAIMER

I am still learning about cashay implementation details and it's possible one or more of my assumptions are wrong. Please push back.

Summary

Add a hook to allow users to access the cache while a query is in-flight.

Reason

When the user requests data, it is possible that some or all of the data already exists in cashay's cache. It would be beneficial from a performance standpoint to utilize that cache.

Note: the goal is not to reduce network traffic, but to display the data we already have, if we have it.

Cashay does this already under certain circumstances:

  • when it recognizes the Operation/shape of the query
  • when the user adds the @cached decorator

There is an additional scenario, however, not covered by these—when the user is requesting a GraphQL field subset for which partial data already exists in the cache.

Consider this query from a project detail page:

query {
  project(id: $project_id) {
    id
    name
    todos {
      id
      name
    }
  }
}

The project detail gets all todos under it, selecting their individual ids and names. Now suppose the user clicks a specific todo, and is taken to a todo detail page which might have the following query:

query {
  todo(id: $todo_id) {
    id
    name
    description
    assignee {
      id
      name
    }
  }
}

Cashay sees this as a totally new query, yet some of the data is already in the cache from the project page! Namely, the todo's id and name.

The @cached decorator cannot be used here for two reasons:

  1. @cached assumes the entire shape can be found in cache and kicks back null for the pieces it doesn't find
  2. If the user enters the todo detail route directly, the cache does not exist yet at all. @cached provides no fallback behavior if an id is not found.

Implementation

Add a new hook for cashay.query that gives the user access to the cache. In my mind, this will behave similar to the way mutationHandlers and "optimisticVariables" work currently, with the following flow:

  1. user executes a query
  2. cashay prepares an initial result as it currently does
  3. cashay executes a "preFetch" hook, providing the initial result and access to the cache
  4. user modifies the initial result using the cache
  5. cashay continues to execute a server query in the usual way
  6. (bonus) cashay executes a "postFetch" hook and returns the final result

Pseudo (using the todo query above as todoQuery):

const todoQuery = `
query {
  todo(id: $todo_id) {
    id
    name
    description
    assignee {
      id
      name
    }
  }
}
`;

const stateToComputed = (state, { todo_id }) => {

  const { data: { todo }, status } = cashay.query(todoQuery, {

    variables: { todo_id },

    preFetch(initialResult, cache) {
      const cached_todo = cache.find('Todo', todo_id);
      if (cached_todo) {
        initialResult.todo = cached_todo;
        return initialResult;
      }
    },

    postFetch(finalResult, serverResponse) {
      // possible manipulation here, not strictly necessary
    }

  });

  return { todo, isLoading: status === 'loading' };
}

const TodoDetail = Component.extend({
  layout: hbs`

  <h1>{{todo.name}}</h1>
  <hr>
  {{#if isLoading}}
    <em>Loading...</em>
  {{else}}
    <div>Description: {{todo.description}}</div>
    <div>Assignee: {{todo.assignee.name}}</div>
  {{/if}}
  
  `
});

With such an API, the component will be able to display the name of the todo, while the rest of the content is loading.

Drawbacks

This opens the door for possible cache misuse and anti-pattern.

Alternatives

Modify @cached instead

Potentially, @cached could be modified to implement the same flow above, where it returns cached data, but fires a server request anyway if the cached data is not found. This would need to account for missing subfields—e.g. if the entire requested shape does not exist in cache, the server must be consulted.


additional context in #154

@dustinfarris dustinfarris changed the title query fromCache hook RFC: query preFetch hook Dec 31, 2016
@dustinfarris
Copy link
Contributor Author

Updated title and description. /cc @mattkrick @jordanh

@jordanh
Copy link
Contributor

jordanh commented Jan 2, 2017

Intriguing proposal @dustinfarris. A suggested name for such a feature "progressive querying." I'll let met respond with his thoughts before adding mine...

@mattkrick
Copy link
Owner

I think the goal of cashay is to have a declarative API (except for mutations, but that's for v2). In other words, you should tell it what you want to do, not how to do it. Declaratively, you want data. You shouldn't have to sorry how that's achieved. Unfortunately, to do that means we'd have to make certain assumptions about the user defined queries on the server (like how they're named), or we'd have to be more verbose.

@dustinfarris
Copy link
Contributor Author

dustinfarris commented Jan 3, 2017

@mattkrick ok, if adding lowish-level hooks are a no-go (à la v2), then I think my alternative proposal should be considered, perhaps in a separate issue.

In a nutshell: modify existing @cached to be smarter about existing data and allow the user to specify fallback behavior in case some or all of the data is missing.

Something like:

query {
  todo @cached(id: $todo_id, type: "Todo", refresh: true) {
    id
    name
    assignee {
      id
      name
    }
  }
}

☝️ adding "refresh: true" instructs Cashay to "give me whatever you have for the resolved id, but kick off a server request and update the information". To your point, @mattkrick, this assumes that the name of the field, todo in this case, matches the name of the server-side field to be queried.

From my understanding, current @cached allows you to name the base field whatever you want, e.g.

getTodoFromCache @cached (id: $todo_id, type: "Todo") {

so it would be necessary to either make this illegal (doubt anyone would go for that), or require the user to specify the server query field explicitly if it is not the same, e.g.

getTodoFromCacheAndServer @cached (id: $todo_id, type: "Todo", refresh: true, field: "todo") {

in the absence of field:, it is assumed that the field name is the same

todo @cached (id: $todo_id, type: "Todo", refresh: true) {

@mattkrick
Copy link
Owner

if we know the name of the query, then do we even need to use a directive? Why not just make it a standard query & then diff out the fields that we already have? If we have all the fields, then we don't send a request to the server.

@dustinfarris
Copy link
Contributor Author

Yep, that works too.

The goal, though, is to use data that potentially came from other queries—you mentioned in another issue something about not making assumptions about type/id uniqueness. I'm still fuzzy about what you meant by that.

Yes, ideally I would love for Cashay to "just do it" when I write,

query {
  todo(id: $todo_id) {
    id
    name
    description
    assignee {
      id
      name
    }
  }
}

☝️

  • look up todo on client-safe schema
  • note that it returns type "Todo", use that to search cache for Todo with todo_id
  • return any requested subfields it already has in cache
  • kick off a server request for either the whole thing (full replacement) or just the subfields that it didn't have—not sure best strategy but it's a win for me regardless

@mattkrick
Copy link
Owner

here are the embedded assumptions in that:

  • calling todo with the same args will return the same result
  • calling todo will return a document
  • calling todo will return a document with an ID that is unique among all todos

These assumptions are all probably safe, heck relay makes these assumptions & that's why you write relay queries in such a funny way... the relay server basically has an id that is a union of all possible types. Their GUID is a base64 composite key of the id & the type, so you don't even need to repeatedly write a bunch of getXById queries, the helper on the server writes those for you. Unfortunately, this assumes you have access to change the server. That server could be set up by another person/department/company, which means you don't have the right to change it.

@dustinfarris
Copy link
Contributor Author

The difference here is that Cashay knows in advance what Type will be returned. Relay and others do not. In my view, this is a clear differentiator that makes those assumptions plausible, without any goofy GUIDs. Server access not required, only the schema.

@mattkrick
Copy link
Owner

definitely, i hear ya, it's a solid proposal. what would you propose to do about arrays of todos? For example let's say on 1 page you have a getTop50Todos. Then, on another page you wanna query 5 of those. would you have 1 query that takes in an array of ids or would you call the same todo 5 times?

@dustinfarris
Copy link
Contributor Author

Interesting question. My gut reaction is the (optional) list of ids.

type Root {
  todos(ids: [ID]): [Todo]
}
query FiveTodos($todo_ids: [ID]) {
  todos(ids: $todo_ids) {
    id
    name
    description
  }
}

Aside: There are other caveats such as interfaces and unions where it may not be easy to determine to exact Type. I'd say the hard-and-fast rule should be: "we will try, but if we can't resolve the Type and id, we're going to the server". We can potentially add more tools here in the future, but I think sticking to known type+id is a good starting point.

@dustinfarris
Copy link
Contributor Author

"other tools" might include a way to specify which Types have UUIDs when cashay is initialized.

e.g.

cashay.create({
  types_with_uuid: ["Todo", "Project"]

and then use that to decide whether uniqueness can be inferred from a particular Union query. If so, great. If not, existing operation/key in cache? Ok, still great. Otherwise, server.

But I digress...

@mattkrick
Copy link
Owner

@cached already handles unions, although i'm not sure how well...works for me at least 😄

@dustinfarris
Copy link
Contributor Author

dustinfarris commented Jan 3, 2017

tl;dr the abbreviated proposal is:


If the Object Type and ID of a query can be determined in advance, Cashay will attempt to find existing data in cache, and backfill missing subfields with a server request.


@dustinfarris dustinfarris changed the title RFC: query preFetch hook RFC: return cached data for a known type/id Jan 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants