Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate alternative translations for (in)equality comparison #34165

Open
ranma42 opened this issue Jul 5, 2024 · 5 comments · May be fixed by #34172
Open

Investigate alternative translations for (in)equality comparison #34165

ranma42 opened this issue Jul 5, 2024 · 5 comments · May be fixed by #34172

Comments

@ranma42
Copy link
Contributor

ranma42 commented Jul 5, 2024

The current translation of nullableA == B is nullableA = B AND nullableA IS NOT NULL.
Similarly, nullableA != B is translated to nullableA <> B OR nullableA IS NULL.

This causes the duplication of the nullableA expression.

Alternative translations that could avoid this issue are CASE WHEN nullableA == b THEN TRUE ELSE FALSE END / CASE WHEN nullableA == b THEN FALSE ELSE TRUE END.

@ranma42
Copy link
Contributor Author

ranma42 commented Jul 5, 2024

Note that this would not fix the issue of double evaluation on SqlServer unless #32519 is fixed.

EDIT: this was referring to the older alternative translation COALESCE(nullableA = B, FALSE) / COALESCE(nullableA <> B, TRUE).

@ranma42 ranma42 changed the title Investigate alternative translations for inequality comparison Investigate alternative translations for (in)equality comparison Jul 5, 2024
@ranma42
Copy link
Contributor Author

ranma42 commented Jul 5, 2024

⚠️ while this alternative translation is interesting to avoid duplication of sub-expressions, it might cause worse plans to be generated, for example when nullableA is a simple column expression.

For example Sqlite could use indexes on a predicates like nullableA <> B OR nullableA IS NULL / nullableA = B AND nullableA IS NOT NULL.

@ranma42
Copy link
Contributor Author

ranma42 commented Jul 5, 2024

For example Sqlite could use indexes on a predicates like nullableA <> B OR nullableA IS NULL [...]

The index would not be useable for the nullableA <> B, so for the inequality it looks like a safe transformation.
For equality, more care might be needed. The same holds true for other comparisons; see 4db69e0 where the same CASE structure is used (and the same issues with indexes would be hit; on the bright side, that translation is only used when !optimize, aka outside of predicates).

ranma42 added a commit to ranma42/efcore that referenced this issue Jul 5, 2024
When comparing a nullable expression to a non-nullable one, a  `NULL` result  always
represent a difference.

This makes it possible to avoid duplicating the nullable expression by mapping
the `NULL` result to a `FALSE` (when comparing for equality).

Fixes dotnet#34165.
@ranma42 ranma42 linked a pull request Jul 5, 2024 that will close this issue
ranma42 added a commit to ranma42/efcore that referenced this issue Jul 6, 2024
When comparing a nullable expression to a non-nullable one, a  `NULL` result  always
represent a difference.

This makes it possible to avoid duplicating the nullable expression by mapping
the `NULL` result to a `FALSE` (when comparing for equality).

Fixes dotnet#34165.
@roji
Copy link
Member

roji commented Jul 8, 2024

The current translation of nullableA == B is nullableA = B AND nullableA IS NOT NULL.

This only referes to "non-optimized" mode, right?

on the bright side, that translation is only used when !optimize, aka outside of predicates).

Note that !optimize does occur in predicate, e.g. within negation.

In any case, doesn't the switch to IS NOT DISTINCT FROM help here?

@ranma42
Copy link
Contributor Author

ranma42 commented Jul 8, 2024

The current translation of nullableA == B is nullableA = B AND nullableA IS NOT NULL.

This only referes to "non-optimized" mode, right?

Yes (for the !=/OR case, instead, we always emit the nullableA <> B OR nullableA IS NULL).

on the bright side, that translation is only used when !optimize, aka outside of predicates).

Note that !optimize does occur in predicate, e.g. within negation.

In any case, doesn't the switch to IS NOT DISTINCT FROM help here?

Yes, IS NOT DISTINCT FROM is a better solution, but requires explicit support from the DB provider.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants