
Finding the right Batching Mechanism #25

Open · michaelstaib opened this issue Mar 7, 2024 · 7 comments

@michaelstaib

Batching Mechanisms for Distributed Executors

To implement efficient distributed executors for composite schemas, we need robust batching mechanisms. Introducing explicit batching fields for fetching entities by key is a straightforward approach, but it breaks down when entities have data dependencies on other source schemas.

Consider the following GraphQL schema:

type Query {
  orderById(id: ID!): Order

  # batching field
  ordersById(ids: [ID!]!): [Order]!
}

The issue arises with directives like @require on lower-level fields: when a field's arguments depend on data from another source schema, simple key-based batching is insufficient.

Example Scenario:

Source Schema 1:

type Query {
    orderById(id: ID!): Order
    ordersById(ids: [ID!]!): [Order]!
}

type Order {
    id: ID!
    deliveryEstimate(dimension: ProductDimensionInput! @require(fields: "product { dimension }")): Int!
}

Source Schema 2:

type Query {
  orderById(id: ID!): Order
  ordersById(ids: [ID!]!): [Order]!
}

type Order {
  id: ID!
  product: Product
}

When the distributed executor batches such fields, there is no way to supply an individual requirement per key:

query($ids: [ID!]!, $requirement: ProductDimensionInput!) { # <-- we cannot pass a requirement for each key
    ordersById(ids: $ids) {
        deliveryEstimate(dimension: $requirement)
    }
}

Apollo Federation's _entities field offers a workaround:

extend type Query {
  _entities(representations: [_Any!]!): [_Entity]!
}

The _entities field allows passing in data that represents partial state of an object. This works around how GraphQL normally operates and introduces untyped inputs. Ideally, we want a batching mechanism that does not require a subgraph to introduce a special field like _entities.
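
For illustration: with Federation-style @requires, the required data travels inside each representation rather than as a field argument, so a gateway request might look like this (a sketch; the size and weight fields are assumptions, since ProductDimensionInput is not defined above):

{
  "query": "query($representations: [_Any!]!) { _entities(representations: $representations) { ... on Order { deliveryEstimate } } }",
  "variables": {
    "representations": [
      { "__typename": "Order", "id": "1", "product": { "dimension": { "size": 10, "weight": 100 } } },
      { "__typename": "Order", "id": "2", "product": { "dimension": { "size": 5, "weight": 40 } } }
    ]
  }
}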

Batching Approaches

The GraphQL ecosystem has devised various batching approaches, each with its own set of advantages and drawbacks.

Request Batching

Request Batching is the most straightforward approach: multiple GraphQL requests are sent in a single HTTP request. This method is widely adopted due to its simplicity and compatibility with many GraphQL servers. However, the lack of semantic relation between the batched requests limits optimization opportunities, as each request is executed in isolation. This can result in inefficiencies, especially when the data required by the requests overlaps.

[
  {
      "query": "query getHero { hero { name } }",
      "operationName": "getHero",
      "variables": {
          "a": 1,
          "b": "abc"
      }
  },
  {
      "query": "query getHero { hero { name } }",
      "operationName": "getHero",
      "variables": {
          "a": 1,
          "b": "abc"
      }
  }
]

Pros:

  • Broad adoption across GraphQL servers.
  • Straightforward implementation.

Cons:

  • Executes each request in isolation, with no semantic relation between requests.
  • Hard to optimize due to isolated execution.

Operation Batching

Operation Batching, as shown by Lee Byron in 2016, leverages the @export directive to flow data between operations within a single HTTP request. This approach allows the result of one operation to be used as input for another, enabling more complex data-fetching strategies. The downsides are implementation complexity and limited adoption, which may restrict its practicality for some projects. Additionally, it does not really target our problem space.

POST /graphql?batchOperations=[Operation1,Operation2]
{
  "query": "query Operation1 { stories { id @export(as: \"storyIds\") } } query Operation2($storyIds: [Int!]!) { storiesById(ids: $storyIds) { name } }"
}

Pros:

  • Facilitates data flow between requests.

Cons:

  • Complex implementation.
  • Limited adoption.
  • Niche application (a precursor of @defer).

Variable Batching

Variable Batching addresses a specific batching use case by allowing a single request to carry multiple sets of variables, potentially enabling more optimized execution paths through the executor. In experiments we could reduce the batching overhead to roughly the impact a DataLoader has on a request, which is promising.

{
  "query": "query getHero($a: Int!, $b: String!) { field(a: $a, b: $b) }",
  "variables": [
    {
      "a": 1,
      "b": "abc"
    },
    {
      "a": 2,
      "b": "def"
    }
  ]
}
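
Applied to the @require scenario above, the gateway could send one operation and one variable set per entity (a sketch; the size and weight fields are again assumed values):

{
  "query": "query($id: ID!, $dimension: ProductDimensionInput!) { orderById(id: $id) { deliveryEstimate(dimension: $dimension) } }",
  "variables": [
    { "id": "1", "dimension": { "size": 10, "weight": 100 } },
    { "id": "2", "dimension": { "size": 5, "weight": 40 } }
  ]
}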

Pros:

  • Optimizes a single request path.
  • Relatively simple to implement.

Cons:

  • Limited adoption.

Alias Batching

Alias Batching uses field aliases to request multiple resources within a single GraphQL document, which makes it work with every spec-compliant GraphQL server. This method's strength lies in its compatibility and ease of use. However, it significantly hinders optimization because each generated document is essentially unique, preventing effective caching strategies (validation, parsing, query planning). While it might solve the immediate problem of batching requests, its impact on performance and scalability makes it not ideal.

{
  a: product(id: 1) {
    ...
  }
  b: product(id: 2) {
    ...
  }
  c: product(id: 3) {
    ...
  }
}

Pros:

  • Compatible with all GraphQL servers.
  • Simple to use for batching requests.

Cons:

  • Hinders optimization because each document is unique.
  • Prevents effective caching strategies (validation, parsing, query planning).
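
For comparison, the @require scenario under alias batching would duplicate the field per key and inline each entity's requirement (a sketch with assumed dimension values); every new combination of keys and requirements produces a unique document:

{
  o1: orderById(id: "1") { deliveryEstimate(dimension: { size: 10, weight: 100 }) }
  o2: orderById(id: "2") { deliveryEstimate(dimension: { size: 5, weight: 40 }) }
}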

@kamilkisiela

As you mentioned, only Alias Batching is compatible with existing GraphQL servers out of the box.

Opting for any alternative method forces the gateway to be aware of which GraphQL servers support that method, which may become challenging.
Either the schema says "I'm compatible with X" (I'd prefer this option) or the GraphQL server does (which requires some form of "introspection" and becomes a mess when you deal with multiple instances or gradual deployments).

IMO, Variable Batching is the strongest option here.

We need to make sure it's compliant with the GraphQL spec (not just the GraphQL-over-HTTP spec).
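
One hypothetical shape for such a schema-level declaration (the directive below is invented purely for illustration; nothing like it exists today):

schema @supports(feature: "variableBatching") { # hypothetical directive
  query: Query
}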

@dariuszkuc commented Mar 7, 2024

Variable batching would probably be the best approach, and it was actually proposed as an addition to the GraphQL spec a while back (#517). That being said, even if it is accepted as part of the official spec, it would take a while to reach wide adoption... as such, I think the only viable option* is alias batching.

*Many frameworks do support request batching, but since it is not part of an official spec, I'd imagine there are still quite a few servers that might not support it.

@benjie commented Mar 7, 2024

Variable batching gets my vote; it definitely feels the most GraphQL-y. For people who already use DataLoaders on the context, they just need to share the context across the multiple queries, and executing them in parallel should have minimal overhead.

@smyrick commented Mar 12, 2024

Variable batching also seems like the best option if we know that we can always execute the same operation and we are only ever selecting one entity while changing the ID.

Wouldn't we also want to support selecting different entities from one subgraph fetch, though? If we have to resolve 3 Foos but 10 Bars, using variable batching would require us to have a nullable entity fetcher, which I suppose could be a requirement; or would it be better to scope it per request?

[
  {
    "query": "query getFoo($a: Int!) { foo(a: $a) }",
    "variables": {
      "a": 1
    }
  },
  {
    "query": "query getFoo($a: Int!) { foo(a: $a) }",
    "variables": {
      "a": 2
    }
  },
  {
    "query": "query getBar($b: Int!) { bar(b: $b) }",
    "variables": {
      "b": 1
    }
  }
]

VS

{
  "query": "query getFooAndBar($a: Int, $b: Int) { foo(a: $a) bar(b: $b) }",
  "variables": [
    { "a": 1 }, { "a": 2 }, { "b": 1 }
  ]
}

We could also add the option of request AND variable batching

[
  {
    "query": "query getFoo($a: Int!) { foo(a: $a) }",
    "variables": [
      { "a": 1 }, { "a": 2 }
    ]
  },
  {
    "query": "query getBar($b: Int!) { bar(b: $b) }",
    "variables": {
      "b": 1
    }
  }
]

@kamilkisiela

My take on batching non-unique operation bodies:

When multiple operations are batched together, there is a risk: say fetching 'Bar' takes 2s, while 'Foo' is optimized and takes only 90ms.
In such cases, the overall time required for the batched operations is determined by the slowest one.

While this is true for all types of batching, enforcing a single operation body helps minimize the impact, as the execution flow is likely to be consistent across all variable sets.

We could stream the response of each execution, but I don't think it would improve gateway performance, as every batched operation most likely has to resolve before the next step of the query planner can run.

@andimarek

I understand that this has much wider implications, but just to put it out there: Source Schema 1 could also provide a root field that can be queried like this:

query($ids: [ID!]!, $requirements: [ProductDimensionInput!]) {
    ordersDimensions(ids: $ids, dimensions: $requirements)
}

This would not require any special batching mechanism at all.
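
For context, this implies Source Schema 1 exposing a root field along these lines (a hypothetical SDL sketch; the return type mirrors deliveryEstimate: Int! above):

extend type Query {
  # hypothetical batch field: estimates are returned positionally, one per id
  ordersDimensions(ids: [ID!]!, dimensions: [ProductDimensionInput!]): [Int]!
}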

o0Ignition0o added a commit to apollographql/router that referenced this issue May 23, 2024
…tes (#5097)

Two main things that we're doing in this PR:
1. We've added a variable to FetchNode called `context_rewrites`. This is a vector of DataRewrite::KeyRenamer entries that take data from their path (which is relative and can traverse up the data path) and write it into an argument that is passed to the selection set.
2. There are two cases. In the most straightforward one, the data passed to the selection set is the same for every entity; this case is pretty easy and doesn't require any special handling. In the second case, the value of the variable may differ per entity. If that is true, we need to use aliasing and duplication in our query in order to send it to subgraphs. Once graphql/composite-schemas-spec#25 is decided and has subgraph support, this query cloning will be able to go away.

Co-authored-by: o0Ignition0o <[email protected]>
Co-authored-by: Gary Pennington <[email protected]>
lrlna pushed a commit to apollographql/router that referenced this issue Jun 3, 2024
@xuorig commented Sep 15, 2024

Was @andimarek's suggestion discussed? Introducing other modes of entity resolution seems to diminish the static and explicit benefits of entityResolver; it feels much more desirable to me to go all the way towards explicit entity resolvers.

Tooling and validation can help identify missing resolvers and generate them if needed (I haven't thought it through entirely; is it possible?).
