Implement `ApplyToInternal::compute_output_shape` and `JSONSelection::shape` returning `shape::Shape` #6458

benjamn · 2024-12-13T23:00:55Z

This draft PR uses our newly published shape crate to implement the ApplyToInternal::compute_output_shape trait method, which returns a shape::Shape, which is an immutable and reference-counted Rust data structure that describes a set of constraints on JSON data. The "shape" terminology is a close synonym for "type" but I'm using it to emphasize the immutable, reference-free, value-typed nature of the JSON domain.

As an important part of this ::compute_output_shape system, all -> methods now define a method_shape function that allows them to reject unexpected input shapes and compute their own output shape in terms of a given input shape and variable shapes. In other words, the type signatures of -> methods are expressed using Rust code which ultimately returns a shape::Shape, which then gets further processed by the compute_output_shape logic, allowing -> methods to blend seamlessly into the shape processing system.

This new ability to compute a static output Shape for a given JSONSelection string allows us to relax the constraint that inline PathSelection items must have a static {...} subselection at the end. The static subselection was previously our only means of statically determining the output fields of the PathSelection, but now we can statically/programmatically examine the output Shape, which doesn't distinguish between static subselections and -> method results (as long as they have the same computed object shape). We are also now requiring a ... token before inline PathSelection items without static subselections, and encouraging the ... token for any inline PathSelection, even if it has a subselection (though enforcing it in that case would be a breaking change).

Finally, this PR proposes a way of setting the __typename in situations where abstract GraphQL types (interfaces and unions) are involved: by writing a { <Type> ... } annotation at the beginning of any subselection, developers can indicate to both the Shape system and the runtime system that the __typename of this output object should have the literal shape "Type", and the object should include __typename: "Type" at runtime. The underlying Shape representation actually does use a __typename field to model this information, but that's a choice made by the JSONSelection code, not something embedded in the Shape library.

I wish I could have broken this PR up more, though I was able to split out #6455, #6456, and #6457.

I don't believe this PR should introduce any logical runtime differences of behavior in the connectors system, since it's basically adding the compute_output_shape method and then only using it in tests. Once I address a few TODOs, the output shapes will play a role in checking the validity of ...-inlined PathSelection items, but the ... syntax is new with this PR, so changing it is still easy.

svc-apollo-docs · 2024-12-13T23:00:59Z

✅ Docs Preview Ready

No new or changed pages found.

router-perf · 2024-12-13T23:01:26Z

CI performance tests

benjamn · 2024-12-13T23:11:05Z

apollo-federation/src/sources/connect/json_selection/README.md

 ```ebnf
 JSONSelection        ::= PathSelection | NamedSelection*
-SubSelection         ::= "{" NamedSelection* "}"
+SubSelection         ::= "{" OutputTypeAnnotation? NamedSelection* "}"
+OutputTypeAnnotation ::= "<" Identifier ">"
 NamedSelection       ::= NamedPathSelection | PathWithSubSelection | NamedFieldSelection | NamedGroupSelection
-NamedPathSelection   ::= Alias PathSelection
+NamedPathSelection   ::= (Alias | "...") PathSelection
 NamedFieldSelection  ::= Alias? Key SubSelection?
 NamedGroupSelection  ::= Alias SubSelection


Here's the only new JSONSelection syntax we need: { <Type> ... } annotations and ... as a way of denoting inline NamedPathSelection items.

benjamn · 2024-12-13T23:12:32Z

apollo-federation/src/sources/connect/json_selection/parser.rs

+        } else if let Ok((remainder, _dots)) =
+            tuple((spaces_or_comments, ranged_span("...")))(input)
+        {
+            // TODO Enforce path has a static output shape with known object keys.


Here's a remaining TODO.

benjamn · 2024-12-13T23:12:56Z

apollo-federation/src/sources/connect/json_selection/parser.rs

+                    // TODO If the path is inlined with ... and has no explicit
+                    // SubSelection, we should use the computed Shape of the
+                    // path to determine the output fields.


Another TODO I'm planning to get to before taking this PR out of draft.

benjamn · 2024-12-13T23:13:20Z

apollo-federation/src/sources/connect/json_selection/parser.rs

+                    } else if *inline {
+                        // If the PathSelection is inlined with ... but does not
+                        // have a subselection, we can use the computed Shape of
+                        // the path to synthesize the output selections.
+                        // TODO Implement this.


Another TODO for the list.

lennyburdette · 2024-12-16T17:11:42Z

apollo-federation/src/sources/connect/json_selection/README.md

 NamedSelection       ::= NamedPathSelection | PathWithSubSelection | NamedFieldSelection | NamedGroupSelection
-NamedPathSelection   ::= Alias PathSelection
+NamedPathSelection   ::= (Alias | "...") PathSelection


does this allow for future keywords like if or match directly after the ...?

If the PathSelection started with an if token, that would be a potential concern for making if a keyword in the future, but if we stick with the the idea of a parenthesized test (i.e. ... if (test) {}), the ( character should prevent parsing as a PathSelection, causing the parser to backtrack and consider other ways of parsing ... if (test) {} (which we can define).

Same goes for ... match (value), as long as we use parentheses, or as long as we're willing to make the match field illegal immediately after ... (a breaking change but not unthinkable).

lennyburdette · 2024-12-16T17:14:28Z

apollo-federation/src/sources/connect/json_selection/apply_to.rs

@@ -69,6 +73,44 @@ impl JSONSelection {

        (value, errors.into_iter().collect())
    }
+
+    pub fn shape(&self) -> Shape {


what's the reason for having shape code in this file? it feels different enough to warrant a separate file and maybe a separate trait, even if it's structured basically the same

It made the code a little easier to write to have apply_to_path nearby, but there's probably no harm in moving it to a new file now.

lennyburdette · 2024-12-16T17:39:01Z

apollo-federation/src/sources/connect/json_selection/methods/public.rs

+
+        for pair in args {
+            if let LitExpr::Array(pair) = pair.as_ref() {
+                if pair.len() == 2 {


should we return an error if the length isn't exactly 2?

do you have an example of error handling? since this doesn't return a Result, what should a caller do with a Shape::error_with_range?

I imagine we'll add a way to collect/traverse/visit any ShapeCase::Error variants within a given Shape, to handle them programmatically or surface them in the debugger, but for now the main point of errors is to fail validation against expected GraphQL schema types.

Since you can embed ShapeCase::Error within a larger shape, shape errors are similar to the GraphQL errors-as-data pattern, where you're able to get back some partial shape information along with contextual errors.

This syntax allows us to model the GraphQL behavior of automatically mapping selection sets over lists of objects, even when we can't determine (yet) whether the input has an array shape or not (e.g. any undeclared named shape reference). Once you provide enough input shape information to resolve the array ambiguity, you won't see as many `.*` segments in named shape subpaths (and you also won't see as many named shape references in general, because you've probably provided at least some of the shape definitions for those names).

TODO Make sure we audit any validations preventing key collisions, because we now have a more graceful way of handling the collisions.

The special `{ <Type> }` syntax for selection sets still works as a shorthand for `{ __typename: "Type" }`, and may be more useful for static analysis because it enforces `Type` is an identifier. We may even want to forbid setting `__typename` to anything other than a (union of) string literal shapes, or forbid setting `__typename` directly altogether, in favor of the `<Type>` syntax, since it provides the flexibility we're comfortable with.

This allows us to postpone PR #6456 for further consideration.

apollo-bot2 · 2024-12-16T20:02:32Z

Detected SAST Vulnerabilities

🔴 Potential Security Issues Found

If you are seeing this message, it means that the security scanning tool that Apollo uses to test our source has identified a potential security issue in code introduced or updated via your branch. Information about what was detected and steps to move forward are below.

If the scanner detected a legitimate issue, please take action to correct it prior to merging this PR. The action required will vary based on the detection. If the detection is a false positive, please follow the steps below to resolve the issue.

Issues Detected

Message	Caused CI Failure?	Link	Path	Signature
Insecure WebSocket Detected. WebSocket Secure (wss) should be used for all WebSocket connections.	Yes	Link	apollo- router/src/protocols/websocket.rs	rules.providers.semgrep.security.ja vascript.lang.security.detect- insecure-websocket
Insecure WebSocket Detected. WebSocket Secure (wss) should be used for all WebSocket connections.	Yes	Link	apollo- router/src/protocols/websocket.rs	rules.providers.semgrep.security.ja vascript.lang.security.detect- insecure-websocket

False Positive Resolution Process

The easiest way to resolve a false positive is to add a comment containing nosemgrep to the code that triggered the detection. You can add the comment directly to the line triggering the detection or on the line immediately above it. For example:

#!/usr/bin/env python3

def my_function():
    my_code_generating_detection() # nosemgrep

What if I can't add a line comment?

If you can't add a comment because the detection is generated by a file that does not support comments (like JSON), you can use Semgrep's .semgrepignore file. For more information about how to create an exclusion via .semgrepignore, refer to the Semgrep documentation.

Getting Help

The Apollo Security team is available to assist in resolving this issue. Please tag us on this PR using @apollographql/security if you need assistance!

How do I know I fixed this correctly?

Once you have resolved the issue completely, this message will disappear! If you're still seeing this message, there is more to do prior to merging.

benjamn self-assigned this Dec 13, 2024

benjamn requested review from lennyburdette, nicholascioli, dylan-apollo and pubmodmatt December 13, 2024 23:01

benjamn commented Dec 13, 2024

View reviewed changes

benjamn changed the base branch from benjamn/JSONSelection-withError-arrow-method to benjamn/JSONSelection-variadic-ExprPath December 16, 2024 16:56

lennyburdette reviewed Dec 16, 2024

View reviewed changes

benjamn force-pushed the benjamn/JSONSelection-variadic-ExprPath branch from 909e897 to e5eedde Compare December 16, 2024 18:24

benjamn force-pushed the benjamn/JSONSelection-compute_output_shape branch from 15b6b8c to 1fde6d8 Compare December 16, 2024 18:24

benjamn force-pushed the benjamn/JSONSelection-variadic-ExprPath branch from e5eedde to 16d0306 Compare December 16, 2024 18:48

benjamn force-pushed the benjamn/JSONSelection-compute_output_shape branch from 1fde6d8 to ebb3e95 Compare December 16, 2024 19:03

benjamn force-pushed the benjamn/JSONSelection-variadic-ExprPath branch from 16d0306 to dfeb558 Compare December 16, 2024 19:12

benjamn added 12 commits December 16, 2024 14:30

Remove unused generic parameter from Ranged trait.

41110e2

Implement Hash for WithRange<T> only when T: Hash.

812fee2

Implement/use PrettyPrintable for Alias and Key.

9e01f86

Remove redundant NamedSelection::apply_to_path array base case.

6c72bf6

Use 0.3.1 prerelease of shape crate.

d19fd7a

Implement ApplyToInternal::compute_output_shape trait method.

6003cc7

Update to latest Shape::* helper methods.

e12a72d

Relax inline path SubSelection requirement and add shape tests.

85dfb20

Track optional <T> output shape of SubSelection.

2d2a265

Permit ... before PathSelection without SubSelection.

b0253ad

Update snapshots.

10ac591

benjamn added 7 commits December 16, 2024 14:30

Merge output keys recursively upon collisions, following shape logic.

df5fd01

TODO Make sure we audit any validations preventing key collisions, because we now have a more graceful way of handling the collisions.

Implement ->then method for lightweight conditional logic.

d649cea

Rename JSONSelection::static_shape to ::shape.

e93ff44

Add OutputTypeAnnotation to SubSelection grammar.

1a37845

Address 'cargo xtask lint' feedback.

cba51a9

Adjust $(...) implementation/tests to postpone multi-arg version.

70486c1

This allows us to postpone PR #6456 for further consideration.

benjamn force-pushed the benjamn/JSONSelection-compute_output_shape branch from ebb3e95 to 70486c1 Compare December 16, 2024 20:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement `ApplyToInternal::compute_output_shape` and `JSONSelection::shape` returning `shape::Shape` #6458

Implement `ApplyToInternal::compute_output_shape` and `JSONSelection::shape` returning `shape::Shape` #6458

benjamn commented Dec 13, 2024 •

edited

Loading

svc-apollo-docs commented Dec 13, 2024 •

edited

Loading

router-perf bot commented Dec 13, 2024

benjamn Dec 13, 2024 •

edited

Loading

benjamn Dec 13, 2024

benjamn Dec 13, 2024

benjamn Dec 13, 2024

lennyburdette Dec 16, 2024

benjamn Dec 16, 2024 •

edited

Loading

lennyburdette Dec 16, 2024

benjamn Dec 16, 2024 •

edited

Loading

lennyburdette Dec 16, 2024

lennyburdette Dec 16, 2024

benjamn Dec 16, 2024

apollo-bot2 commented Dec 16, 2024

False Positive Resolution Process

What if I can't add a line comment?

Getting Help

How do I know I fixed this correctly?

Implement ApplyToInternal::compute_output_shape and JSONSelection::shape returning shape::Shape #6458

Are you sure you want to change the base?

Implement ApplyToInternal::compute_output_shape and JSONSelection::shape returning shape::Shape #6458

Conversation

benjamn commented Dec 13, 2024 • edited Loading

svc-apollo-docs commented Dec 13, 2024 • edited Loading

✅ Docs Preview Ready

router-perf bot commented Dec 13, 2024

benjamn Dec 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

benjamn Dec 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

benjamn Dec 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

apollo-bot2 commented Dec 16, 2024

Detected SAST Vulnerabilities

🔴 Potential Security Issues Found

Issues Detected

False Positive Resolution Process

What if I can't add a line comment?

Getting Help

How do I know I fixed this correctly?

Implement `ApplyToInternal::compute_output_shape` and `JSONSelection::shape` returning `shape::Shape` #6458

Implement `ApplyToInternal::compute_output_shape` and `JSONSelection::shape` returning `shape::Shape` #6458

benjamn commented Dec 13, 2024 •

edited

Loading

svc-apollo-docs commented Dec 13, 2024 •

edited

Loading

benjamn Dec 13, 2024 •

edited

Loading

benjamn Dec 16, 2024 •

edited

Loading

benjamn Dec 16, 2024 •

edited

Loading