Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: expression validation and evaluation with CEL #1666

Draft
wants to merge 36 commits into
base: main
Choose a base branch
from
Draft

Conversation

davenewza
Copy link
Contributor

@davenewza davenewza commented Nov 5, 2024

Expression parsing using CEL

📖 See Computed Fields product doc here: https://www.notion.so/keelhq/Computed-fields-fa1f91f188134ece9b2072e9b22e0c1e

Why?

Computed fields will introduce a lot more complexity to Keel expressions. We'll need to support:

  • Arithmetic operations, e.g. item.price - item.cost
  • Functions with arguments, e.g. ABS(value) or DURATION(start, end) or YEAR(date)
  • Aggregate functions which operate on 1:M data, e.g. SUM(items.price)
  • Expression nesting with proper operator/function precedence, e.g. SUM(account.transactions.value) + account.overdraft > 0 or account.isLocked

Our previous expression AST and parsing implementation would need to be significantly extended to support these new features and could would become quite a substantial amount of code to write and maintain, especially if we want to have really good type checking and validations.

The cel-go package parses, validates and evaluates CEL expressions and also supports everything raised above. CEL is almost identical to Keel expressions, and we've proposed adopting it so not to spend precious energy on extending our codebase to do almost exactly what cel-go would do for us.

Technical overview

The main objective of this PR is to reach feature parity and not to introduce anything new yet to our expressions language.

These points mostly cover what has changed in this PR:

  • Our AST doesn't describe expressions by their distinct parts anymore.
  • Each attribute has its own cel-go configuration used for parsing.
  • Expression validation is now done by cel-go. We do not need to interrogate the AST anymore.
  • Validation errors from cel-go are intercepted and made more friendly. Fortunately we still support discrete token positions with these validation messages.
  • Expression evaluation (for example, to generate SQL) is done using visitors.
  • SQL gen tests for @where, @set and @permission all remain mostly unchanged.

Primary packages

I think it's useful to briefly explain the main packages involved.

expressions: The abstraction over CEL which can validate expressions with Validate(*parser.Expression)

expressions/options: A library of options which can be used to configure the validator. That's important because each attribute would configure their expressions with more or less context and features. For example, a @where expression would be configured with its action input variables and with ctx whereas a @default attribute would not.

expressions/visitor: An expression visitor runner which can be used to interrogate expressions if need be. For example, this can be used to generate SQL or read idents (see next).

expressions/resolve: Various ways to resolve expressions down to values, idents, ident arrays, etc. Sometimes we actually only care about constants and not evaluation, for example, @unique([sku, supplierCode]).

parser/attributes: Parser configuration for each attribute type. Our validation code only needs to really call on these to validate their expressions. E.g. issues, err := attributes.ValidateWhereExpression(asts, action, expression)

What's still to do?

  • Validation messages need to be improved as they are less useful than what they used to be.
  • Validation message hints need to be reintroduced.
  • The functions runtime permission generation needs to be supported. This should be simple enough to do by implementing a new visitor.
  • DateTime and Date comparison support needs to be added to the cel-go configuration options.
  • Optional inputs and null fields need stricter validation checking still.
  • not in is currently not supported.
  • Migrating to using && and || - breaking change - more to follow.
  • There is a bug with token positions.
  • Fix the rest of the schema validation tests

@davenewza davenewza marked this pull request as draft November 5, 2024 05:55
@davenewza davenewza added the WIP work in progress label Nov 5, 2024
@davenewza davenewza changed the title poc/cel WIP: expression validation and evaluation with CEL Nov 27, 2024
Copy link
Member

@jonbretman jonbretman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do worry we're going to make the validation errors a lot less friendly, which will hurt our less technical users more maybe if they mostly work in the schema.

Do you have a plan for how we can make them better?

"github.com/teamkeel/keel/schema/validation/errorhandling"
)

func TestParser_Variable(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like these tests would really benefit from using fixtures, a lot of duplication in in this file. Would be nice to use the same/similar approach we use for validation tests with schema files that contain expected errors and just loop over them.

Any reason you didn't do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great point. The only reason why I didn't do this is because I've been going about this PR with a bit of a test-driven approach and just happened to not use fixtures, but agreed that this needs to change. Will add it to the list

type TypeProvider struct {
Schema []*parser.AST
Model string
Objects map[string]map[string]*types.Type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to add a comment here explaining what the keys of the maps are.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do - thanks

}

stmt.expression += ")"
// stmt.expression += "("
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the functions runtime stuff?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it is. I need to fit this code into a cel visitor which should be relatively straight forward.

if err != nil {
return err
}
// fragments, err := lhsResolver.NormalisedFragments()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ta

"github.com/teamkeel/keel/schema/reader"
)

func TestUnique_Valid(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again feels like these would be better using fixtures.

// ToAssignmentExpression splits an assignment expression into two seperate expressions.
// E.g. the expression `post.age = 1 + 1` will become `post.age` and `1 + 1`
func (expr *Expression) ToAssignmentExpression() (*Expression, *Expression, error) {
parts := strings.Split(expr.String(), "=")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An edge case but wouldn't this break with an expression like model.field = "=== HELLO ==="

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonbretman Ah yes, what I've done is quite crude. I should rather find the assignment operator by iterating through the tokens. Will do that 👍

}

actions {
update updatePost(id) with (title) {
//expect-error:18:28:ActionInputError:title is already being used as an input so cannot also be used in an expression
@set(post.title = title)
}
update updatePost2(id) with (title) {
//expect-error:18:44:AttributeExpressionError:undeclared reference to 'post' (in container '')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're not wrong about the error messages being less helpful. The error here is that it's using post on the RHS which isn't allowed right? How can we make these errors better? in container '' is pretty cryptic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, this is actually fixed - Im still in the progress of updating all these tests. It's actually erroring with unknown identifier 'post', but even that can be improved further

@davenewza
Copy link
Contributor Author

I do worry we're going to make the validation errors a lot less friendly, which will hurt our less technical users more maybe if they mostly work in the schema.

Do you have a plan for how we can make them better?

I agree that this will be problem if not addressed. My plan is to expand on converting cel-go validation errors to customer-friendly ones. This is working pretty well so far here but as mentioned needs to be expanded on. I also need to reintroduce hints for each attribute type.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
WIP work in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants