Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contains to OPENJSON translation regresses performance #32394

Open
Suchiman opened this issue Nov 22, 2023 · 128 comments
Open

Contains to OPENJSON translation regresses performance #32394

Suchiman opened this issue Nov 22, 2023 · 128 comments

Comments

@Suchiman
Copy link
Contributor

Suchiman commented Nov 22, 2023

After upgrading to EFC8, we've run into several severe performance regressions where millisecond queries started timeouting.
This is due to EFC8 now generating

DECLARE @__p_1 int = 0
DECLARE @__p_2 int = 25
DECLARE @__profileIds_0 nvarchar(4000) = N'[923]'
SELECT [a].[AmsPk] AS [Id], [a].[Bearbeitet], [a].[Amsidnr]
FROM [AmsKunden] AS [a]
INNER JOIN [StorageProfiles] AS [s] ON [a].[Profile_Id] = [s].[Id]
WHERE [s].[Id] IN (
    SELECT [p].[value]
    FROM OPENJSON(@__profileIds_0) WITH ([value] int '$') AS [p]
)
ORDER BY [a].[Bearbeitet] DESC
OFFSET @__p_1 ROWS FETCH NEXT @__p_2 ROWS ONLY

image

where it used to generate

DECLARE @__p_1 int = 0
DECLARE @__p_2 int = 25
SELECT [a].[AmsPk] AS [Id], [a].[Bearbeitet], [a].[Amsidnr]
FROM [AmsKunden] AS [a]
INNER JOIN [StorageProfiles] AS [s] ON [a].[Profile_Id] = [s].[Id]
WHERE [s].[Id] IN (923)
ORDER BY [a].[Bearbeitet] DESC
OFFSET @__p_1 ROWS FETCH NEXT @__p_2 ROWS ONLY

image

from my analysis, the problem here is that the cardinality estimator flat assumes that OPENJSON will return 50 rows. If the column that you're filtering on is not very selective, that is enough to dissuade SQL Server from seeking that index. In addition, it also dissuades it from using filtered indexes which requires constants but that's orthogonal. I have a lot of queries where i do .Where(x => col.Contains(x.SomeId)) where col contains in 99% of the time just one element, with the very rare 0 or occasional 2 elements (although more are possible in theory).

Since that is absolutely blocking, i had to apply the CompatLevel 120 hack but i consider that quite the nuclear option, especially since we would like to use more of the ToJson features. The only other option i could see to get around that was to apply a FORCESEEK hint but this isn't (well) supported in EFC either.

CompatLevel 120 works for now but i don't think that's a permanent solution. Query Cache poisoning and frequent recompilations are not remotely as expensive as queries that regress from milliseconds to "can't finish in 120s", so this feature comes at a trade off that is not worth it for us. The naive better solution to workaround this would be similiar to the SplitQuery feature (that has both a global and query level switch).

Include provider and version information

EF Core version: 8.0
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET 8.0
Operating system: Windows 10
IDE: Visual Studio 2022 17.9P1

@roji
Copy link
Member

roji commented Nov 23, 2023

@Suchiman is it possible for you to provide a repro for this performance regression, i.e. data and a query which show the new OPENJSON-based approach performing significantly worse? That would help deciding what to do here.

@stevendarby
Copy link
Contributor

@Suchiman are the FORCESEEKs in your queries above generated by EF?

@roji sorry if this clouds matters but a problem stemming from cardinality estimates was raised in the original issue and I don't think it was tracked in a separate issue as you requested #13617 (comment). Just a little reminder now in case it's useful to consider alongside this issue.

@Suchiman
Copy link
Contributor Author

are the FORCESEEKs in your queries above generated by EF?

sorry, no, copied the wrong thing where i was tinkering if query hints would help.

data and a query which show the new OPENJSON-based approach performing significantly worse?

I'll try, the databases i'm working with are anywhere between 400GB and 3TB right now so this might need a whole lot of data to start being reproducible.

@ErikEJ
Copy link
Contributor

ErikEJ commented Nov 23, 2023

@Suchiman How could EF Core determine that not using openjson is better in this case?

@Suchiman
Copy link
Contributor Author

@Suchiman How could EF Core determine that not using openjson is better in this case?

Not using OPENJSON is likely always better for query runtime as SQL Server can optimize constants a lot better than an opaque input, there are tradeoffs however on query compilation times, tradeoffs that EFC cannot reasonably make assuming it will always improve things. Thus why i suggested of making it a query level option, akin to AsSplitQuery.

@roji
Copy link
Member

roji commented Nov 24, 2023

@Suchiman @stevendarby thanks for your comments; I'm currently away but will be back next and will fully look into this in more detail.

I agree that there needs to be a way to make EF generate constants for Contains, ideally on a per-query basis. I don't think a new operator is needed - such as AsSplitQuery; what's needed is a way to force the array parameter to be interpreted by EF as a constant instead of as a parameter - that's a more general need that we know we have (there are other places where this is a problem). In other words, I think that to force constantization you'd write something like the following (see #31552):

var ids = new[] { 1, 2 };
_ = await ctx.Blogs.Where(b => EF.Constant(ids).Contains(b.Id)).ToListAsync();

In the meantime, as a pretty verbose workaround, you can use the Expression APIs to produce that exact expression tree:

var ids = new[] { 1, 2 };
ContainsMethodInfo ??= typeof(Enumerable).GetMethods()
    .Single(m => m.Name == nameof(Enumerable.Contains) && m.GetParameters().Length == 2);

var parameter = Expression.Parameter(typeof(Blog), "b");
var predicate = Expression.Lambda<Func<Blog, bool>>(
    Expression.Call(
        ContainsMethodInfo.MakeGenericMethod(typeof(int)),
        Expression.Constant(ids),
        Expression.Property(parameter, nameof(Blog.Id))),
    parameter);

var results = await ctx.Blogs.Where(predicate).ToListAsync();

This produces the same query tree as when the array is written inline in the query:

var results = await ctx.Blogs.Where(b => new[] { 1, 2 }.Contains(b.Id)).ToListAsync();

... and generates the desired SQL:

SELECT [b].[Id], [b].[Name]
FROM [Blogs] AS [b]
WHERE [b].[Id] IN (1, 2)

This is by no means a satisfactory workaround - I know - but I'm posting it here in case you want to keep the SQL Server compatibility level (for other queries) and override the behavior on a per-query basis, today with 8.0.0.

One note: if you know you have a certain number of elements in the parameterized array, you can use an inline array as follows:

var result = await ctx.Blogs.Where(b => new[] { ids[0], ids[1] }.Contains(b.Id)).ToListAsync();

This is recommended for when you know the number of elements, producing the following SQL:

SELECT [b].[Id], [b].[Name]
FROM [Blogs] AS [b]
WHERE [b].[Id] IN (@__p_0, @__p_1)

In any case, I'll take a look at how EF.Constant support would look like and whether it's even possible to consider it for an 8.0 patch.

@ErikEJ
Copy link
Contributor

ErikEJ commented Nov 24, 2023

@Suchiman You might find this extension method useful, uses parameters and simple ORs https://gist.github.com/ErikEJ/6ab62e8b9c226ecacf02a5e5713ff7bd

@stevendarby
Copy link
Contributor

@Suchiman I would just like to understand what's happening on the SQL Server side a bit more, if that's ok! Regarding this:

the problem here is that the cardinality estimator flat assumes that OPENJSON will return 50 rows. If the column that you're filtering on is not very selective, that is enough to dissuade SQL Server from seeking that index.

Is the implication here that SQL Server would also avoid seeking the index with an IN containing 50 hardcoded values (because it would know there are 50 and not just estimate that)? If so, and turns out it's really slow doing that - doesn't that seem like an odd decision for SQL Server to make? Are your statistics up to date - could that influence its decision?

Out of interest, does adding TOP 1 in the OPENJSON subquery correct its estimate and get it to use the index? I was half wondering if bucketizing a hint like that, so that the estimate is at least within the right order of magnitude, could be a solution: TOP 1, TOP 10, TOP 100 etc. Or maybe there is some other kind of hint that could be used. It would lead to multiple query plans for different sizes, but each might be better performing and the number of them still much lower than pre-EF 8. I've not investigated this though and it might be a complete non-starter.

Anyway, @roji's suggestion seems like a good way forward.

@roji
Copy link
Member

roji commented Nov 24, 2023

I also had @stevendarby's good questions in mind, and definitely still would like to see some sort of repro for this so that we can dig deeper into exactly what's going on... That could help us decide to what extent a patch here is needed for 8.0.0.

@Suchiman
Copy link
Contributor Author

Suchiman commented Nov 27, 2023

Is the implication here that SQL Server would also avoid seeking the index with an IN containing 50 hardcoded values (because it would know there are 50 and not just estimate that)?

Well SQL Server is really an oddball here. If i use WHERE [s].[Id] IN (@p1,@p2) then having 2 params is already enough to make it choose badly (estimated plan but it's the same bad shape):
image

If using hardcoded constants like WHERE [s].[Id] IN (1,2,3,4,5,6...), then it takes 61 constants before it tips over (estimated plan but it's the same bad shape)
image

If so, and turns out it's really slow doing that - doesn't that seem like an odd decision for SQL Server to make? Are your statistics up to date - could that influence its decision?

Doing a statistics update with fullscan is always the first thing i try, i wish that would help all the time 😆

Out of interest, does adding TOP 1 in the OPENJSON subquery correct its estimate and get it to use the index?

That does indeed work, and it seems like for SQL Server, the tipping point is exactly 50 in my case, with TOP (49) i still get the fast plan, with TOP (50) its the slow plan. One could think to make EFC generate TOP(@p_n), that allows both query plan caching and through the magic of parameter sniffing, to get a better plan, including all the gotchas of parameter sniffing such as getting a bad reused plan as well.

@roji
Copy link
Member

roji commented Nov 27, 2023

Thanks for all the extra info and experimentation @Suchiman, that's definitely useful. I know it isn't easy, but some sort of repro for this would allow further investigation and possibly taking this to the SQL Server people to maybe get more insights.

Otherwise, I'm generally averse to EF generating something that's super-tailored to an internal SQL Server quirk (i.e. the TOP - with what exact threshold etc.)... If we see that this is indeed some sort of general thing for all queries using IN, that might make sense, but otherwise I'm not sure what we can do here. You should at least be able to get EF to generate the TOP yourself, i.e. by composing a Take on the parameterized array.

@roji
Copy link
Member

roji commented Dec 4, 2023

FYI everyone, the EF.Constant solution should be there for 8.0.2. I'm still definitely interested in understanding the perf characteristics here deeper, but we're going to need to see some sort of repro - I hope you can help with that.

@stevendarby
Copy link
Contributor

Hi @roji, had a play with EF.Constant on the daily builds and it looks like EF compiles a query for each permutation of values passed to EF.Constant. I know the pre-EF 8 approach couldn't cache the SQL due to its hardcoded values, but pretty sure the query was cached in some form to avoid a full compile each time? Just thinking there may still be an argument for a per-query option to revert to the old behaviour (i.e. the suggestion OP put forward) if EF.Constant isn't the magic bullet. Still need good repros to prove the usefulness of that though...

@roji
Copy link
Member

roji commented Dec 5, 2023

@stevendarby you're right - this is indeed a difference between the new EF.Constant and the old behavior. EF's relational layer contains two levels of caching - one very early one based on the query tree itself, and a later one based only on the nullability of parameters (since SQL varies based on that). The old behavior had the same tree for the 1st cache, but specifically prevented use of the 2nd cache; because EF.Constant integrates the constants very early on in the query tree, that causes a miss in the 1st cache, causing the entire compilation to happen again.

When working on EF.Constant, I briefly considered trying to implement it in a way which doesn't defeat EF's 1st cache.. However, that's considerably more complex/risky (and the point here was to prepare a patch for 8.0, which must be relatively simple/low-risk). In addition, EF.Constant is useful for other scenarios where one really does want to integrate constants in the tree. It's true that a specific flag for Contains (as opposed to the more general EF.Constant) would be relatively easy, but we generally try to avoid having such things unless absolutely necessary.

Note that the majority of other issues related to the new Contains translation have been fixed (or are in the process of being fixed), so I hope EF.Constant won't be needed that much (though the possible performance issues described above do remain).

In any case, users now have an efficient, global (SQL Server) option to disable OPENJSON entirely, and will have EF.Constant as an EF-expensive but per-query option via EF.Constant. I do agree that there's probably still a "gap" there, i.e. a per-query option that's more efficient. There may even be a need for a global flag to opt out of JSON subquery translation for Contains specifically, for all queries (JSON subqueries are absolutely required for most/all queries composed over primitive collections, except for Contains where there's the alternative - so that may make sense). But it seems prudent to wait and see how it all works for users - as well as get a better understanding of the actual perf impact here, with a repro - and if really needed, do another patch later on.

Does that make sense?

@Nefarion
Copy link

Nefarion commented Dec 5, 2023

I would also like to report a massive slowdown with the new OpenJSON query style:

Query with IN: <1 sec (ssms reports 0)
Query with OpenJson: 7m 40s

In my case there are 1 218 265 rows in the table, and 609 values in the contains query.
The lookup is on the PK of the table -> .Where(x => lookup.Contains(x.PK))
The query returns 423 rows.

I am working around this right now by using .ToHashTable() instead of .ToArray() which thankfully circumvents the OpenJSON style query.

I have no experience with execution plans, but it seems to me as if the OpenJSON is executed for every row the table, which results in significant overhead of json parsing. (738 111 845 rows returned in my case, which is close to 1 218 265 * 609)

@RyanONeill1970
Copy link

RyanONeill1970 commented Dec 5, 2023

Here too, we've refactored to remove any usage of OpenJson as it killed a lot of database connections. Previously less than a second, afterwards we'd get timeouts at 30 seconds.

I've anonymised the generated SQL but generally, it was of the form below. Don't judge the massive parameter list, it's grown over time. I would not do it that way now. The param list has been obfuscated to prevent product codes being identified but looked like '1234567890123,1234567890124, etc...'.

exec sp_executesql N'SELECT [o].[Col1], LTRIM(RTRIM([o].[Col2])), [o].[ProductId], [p].[VariantId], [c].[ColourId], [p].[Name], LTRIM(RTRIM([o].[Col3])), [c].[Col4], [o].[ProductSize], [t0].[ImageId], [t0].[Version], [t0].[c]
FROM (
    SELECT * FROM Product WHERE Col3 IN (@p0)
) AS [o]
INNER JOIN [Variant] AS [p] ON [o].[VariantId] = [p].[VariantId]
INNER JOIN [Colour] AS [c] ON [o].[ColourId] = [c].[ColourId]
LEFT JOIN (
    SELECT [t].[ImageId], [t].[Version], [t].[c], [t].[VariantId], [t].[ColourId]
    FROM (
        SELECT [p0].[VariantImageID] AS [ImageId], [p0].[Version], 1 AS [c], [p0].[VariantId], [p0].[ColourId], ROW_NUMBER() OVER(PARTITION BY [p0].[VariantId], [p0].[ColourId] ORDER BY [p0].[PhotoTypeId]) AS [row]
        FROM [VariantImage] AS [p0]
    ) AS [t]
    WHERE [t].[row] <= 1
) AS [t0] ON [p].[VariantId] = [t0].[VariantId] AND [o].[ColourId] = [t0].[ColourId]
ORDER BY [o].[ProductId], [p].[VariantId], [c].[ColourId]',N'@p0 nvarchar(max) ',@p0=N'   xxxx 36000 chars representing a list of 10 character codes xxxx as in 1234567890123,1234567890124, etc...  '

@roji
Copy link
Member

roji commented Dec 5, 2023

@Nefarion @RyanONeill1970 thanks for your reports - we're actively looking into OPENJSON-related issues at the moment so this is important. However, it's hard to understand the exact source of slowdown from an simple "it regressed" report, without some sort of repro. Is it possible to put together a repro for the problematic query, or at the very least, post the full SQL of the affected query (@RyanONeill1970 your query doesn't seem to contain any OPENJSON)?

@RyanONeill1970
Copy link

Duh, that capture must be after I reworked it. Sorry. It's not just that one, we've replaced a few calls which were going slow.
I'll keep an eye out for any more, we've only just upgraded.

@Nefarion
Copy link

Nefarion commented Dec 6, 2023

@roji

The OpenJSON Query looks like this (anonymized table/colum/db names and query parameters)

DECLARE @__ids_0 nvarchar(max)
SET
    @__ids_0 = '["abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345",,"abcde12345","abcde12345",,"abcde12345",,"abcde12345","abcde12345",,"abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345",null,"abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345","abcde12345"]'
SELECT
    [p].[Col1],
    [p].[Col2],
    [p].[Col3],
    [p].[Col4],
    [p].[Col5],
    [p].[Col6],
    [p].[Col7],
    [p].[Id],
    [p].[Col8],
    [p].[Col9],
    [p].[Col10],
    [p].[Col11],
    [p].[Col12]
FROM
    [Tbl] AS [p]
WHERE
    EXISTS (
        SELECT
            1
        FROM
            OPENJSON(@__ids_0) WITH ([value] nvarchar(450) '$') AS [a]
        WHERE
            [a].[value] = [p].[Id]
            OR (
                [a].[value] IS NULL
                AND [p].[Id] IS NULL
            )
    )

The IN Query looks like this:

SELECT
    [p].[Col1],
    [p].[Col2],
    [p].[Col3],
    [p].[Col4],
    [p].[Col5],
    [p].[Col6],
    [p].[Col7],
    [p].[Id],
    [p].[Col8],
    [p].[Col9],
    [p].[Col10],
    [p].[Col11],
    [p].[Col12]
FROM
    [Tbl] AS [p]
WHERE
    [p].[Id] IN ('abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', null, 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345', 'abcde12345')

QueryPlans:
IN.txt
OpenJSON.txt

Edit: I don't know if it matters, but Tbl is a View in both queries

@molesinski
Copy link

We have the same issue. Performance drop is caused by OPENJSON version making Azure SQL ignore non clustered indexes on large tables and forcing it to fall back to full clustered index scan. I do no have source SQL, but when Contains is replaced from IN construct to OPENJSON construct it just prevents non clustered index usage. The query is simple, just SELECT FROM WHERE with 3 single column conditions in where, 2 of them using Contains, while all 3 columns being present in clustered index.

@nh43de
Copy link

nh43de commented Dec 6, 2023

I'm having this same issue with the Sqlite provider - in .NET 7 it would generate WHERE [Name] IN ( <<names>> ), now in 8 it is generating a WHERE EXISTS plus json parsing. The performance went from milliseconds to now timing out. My guess is now it is doing a table scan instead of being able to rely on an index on the [Name] field. The db query optimizer should in theory be smart enough to handle optimizing with the JSON, but clearly it isn't. On the bright side, at least now it's parameterized for better plan cache utilization.

Here's part of the new Sqlite SQL output for the .Contains translation:

.param set @__searchIndexResults_0 '[ <<names>> ]'

WHERE EXISTS (
    SELECT 1
    FROM json_each(@__searchIndexResults_0) AS "s1"
    WHERE "s1"."value" = "s"."Name" )

To repro this simply use something like this


class User {
    public int UserId { get; set; }
    public string Name { get; set; }
}

Add the above to your context - now query:

            var names = new[] { "name1", "name2" };

            var rr = from user in _context.Users
                    where names.Contains(user.Name)
                    select user;

@nh43de
Copy link

nh43de commented Dec 6, 2023

Ok so an update/workaround - if you can re-write your .Contains() as a join instead, the performance will be almost exactly as before:

            var names = new[] { "name1", "name2" };

            var rr = from user in _context.Users
                    join name in Names
                        on user.Name equals name
                    select user;

Which gets translated to

.param set @__p_0 '[ <<values>> ]'

SELECT ...
FROM "Users" AS "u"
INNER JOIN json_each(@__p_0) AS "p" ON "u"."Name" = "p"."value"

On Sqlite. Performance is fantastic - much better than before! And now it's parameterized so should be even faster than .net7 after query compilation.

--

But now I'm going to have to re-write most of my .Contains(), and this workaround will not work with .Contains() == false. E.g. cases where I want to retreive all users except ones in my list.

@roji
Copy link
Member

roji commented Dec 7, 2023

@nh43de thanks for the information on SQLite I'll look into this as well. Note that I considered INNER JOIN here as well when making the EF changes; but the problem is that if the array contains duplicates, the principal rows get duplicated as well. This makes INNER JOIN unsuitable for Contains unless we do apply Distinct somehow on the client, before sending the parameter to the server.

For everyone else, in the absence of any repro above, I'll try to do some additional perf experimentation on SQL Server; in my investigations last year the performance generally seemed good (slightly slower than IN+constants, but definitely nothing that prevents index usage).

But a simple, minimal repro would go a long way to help with this - this is where your help can be very valuable.

@Nefarion
Copy link

Nefarion commented Dec 7, 2023

@roji

-- Setup Table with PK1 -> PK1000000
SELECT TOP (1000000) Id = Concat('PK', CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id])))
INTO TestTbl
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2

-- Add Primary Key
ALTER TABLE TestTbl
ADD PRIMARY KEY (Id);


-- Fast query (<1 sec)
SELECT [Id] FROM TestTbl WHERE [Id]
IN ('PK345','PK1345','PK2345','PK3345','PK4345','PK5345','PK6345','PK7345','PK8345','PK9345','PK10345','PK11345','PK12345','PK13345','PK14345','PK15345','PK16345','PK17345','PK18345','PK19345','PK20345','PK21345','PK22345','PK23345','PK24345','PK25345','PK26345','PK27345','PK28345','PK29345','PK30345','PK31345','PK32345','PK33345','PK34345','PK35345','PK36345','PK37345','PK38345','PK39345','PK40345','PK41345','PK42345','PK43345','PK44345','PK45345','PK46345','PK47345','PK48345','PK49345','PK50345','PK51345','PK52345','PK53345','PK54345','PK55345','PK56345','PK57345','PK58345','PK59345','PK60345','PK61345','PK62345','PK63345','PK64345','PK65345','PK66345','PK67345','PK68345','PK69345','PK70345','PK71345','PK72345','PK73345','PK74345','PK75345','PK76345','PK77345','PK78345','PK79345','PK80345','PK81345','PK82345','PK83345','PK84345','PK85345','PK86345','PK87345','PK88345','PK89345','PK90345','PK91345','PK92345','PK93345','PK94345','PK95345','PK96345','PK97345','PK98345','PK99345','PK100345','PK101345','PK102345','PK103345','PK104345','PK105345','PK106345','PK107345','PK108345','PK109345','PK110345','PK111345','PK112345','PK113345','PK114345','PK115345','PK116345','PK117345','PK118345','PK119345','PK120345','PK121345','PK122345','PK123345','PK124345','PK125345','PK126345','PK127345','PK128345','PK129345','PK130345','PK131345','PK132345','PK133345','PK134345','PK135345','PK136345','PK137345','PK138345','PK139345','PK140345','PK141345','PK142345','PK143345','PK144345','PK145345','PK146345','PK147345','PK148345','PK149345','PK150345','PK151345','PK152345','PK153345','PK154345','PK155345','PK156345','PK157345','PK158345','PK159345','PK160345','PK161345','PK162345','PK163345','PK164345','PK165345','PK166345','PK167345','PK168345','PK169345','PK170345','PK171345','PK172345','PK173345','PK174345','PK175345','PK176345','PK177345','PK178345','PK179345','PK180345','PK181345','PK182345','PK183345','PK184345','PK185345','PK186345','PK187345','PK188345','PK189345','PK190345','PK191345','PK192345','PK193345','PK194345','PK195345','PK196345','PK197345','PK198345','PK199345','PK200345','PK201345','PK202345','PK203345','PK204345','PK205345','PK206345','PK207345','PK208345','PK209345','PK210345','PK211345','PK212345','PK213345','PK214345','PK215345','PK216345','PK217345','PK218345','PK219345','PK220345','PK221345','PK222345','PK223345','PK224345','PK225345','PK226345','PK227345','PK228345','PK229345','PK230345','PK231345','PK232345','PK233345','PK234345','PK235345','PK236345','PK237345','PK238345','PK239345','PK240345','PK241345','PK242345','PK243345','PK244345','PK245345','PK246345','PK247345','PK248345','PK249345','PK250345','PK251345','PK252345','PK253345','PK254345','PK255345','PK256345','PK257345','PK258345','PK259345','PK260345','PK261345','PK262345','PK263345','PK264345','PK265345','PK266345','PK267345','PK268345','PK269345','PK270345','PK271345','PK272345','PK273345','PK274345','PK275345','PK276345','PK277345','PK278345','PK279345','PK280345','PK281345','PK282345','PK283345','PK284345','PK285345','PK286345','PK287345','PK288345','PK289345','PK290345','PK291345','PK292345','PK293345','PK294345','PK295345','PK296345','PK297345','PK298345','PK299345','PK300345','PK301345','PK302345','PK303345','PK304345','PK305345','PK306345','PK307345','PK308345','PK309345','PK310345','PK311345','PK312345','PK313345','PK314345','PK315345','PK316345','PK317345','PK318345','PK319345','PK320345','PK321345','PK322345','PK323345','PK324345','PK325345','PK326345','PK327345','PK328345','PK329345','PK330345','PK331345','PK332345','PK333345','PK334345','PK335345','PK336345','PK337345','PK338345','PK339345','PK340345','PK341345','PK342345','PK343345','PK344345','PK345345','PK346345','PK347345','PK348345','PK349345','PK350345','PK351345','PK352345','PK353345','PK354345','PK355345','PK356345','PK357345','PK358345','PK359345','PK360345','PK361345','PK362345','PK363345','PK364345','PK365345','PK366345','PK367345','PK368345','PK369345','PK370345','PK371345','PK372345','PK373345','PK374345','PK375345','PK376345','PK377345','PK378345','PK379345','PK380345','PK381345','PK382345','PK383345','PK384345','PK385345','PK386345','PK387345','PK388345','PK389345','PK390345','PK391345','PK392345','PK393345','PK394345','PK395345','PK396345','PK397345','PK398345','PK399345','PK400345','PK401345','PK402345','PK403345','PK404345','PK405345','PK406345','PK407345','PK408345','PK409345','PK410345','PK411345','PK412345','PK413345','PK414345','PK415345','PK416345','PK417345','PK418345','PK419345','PK420345','PK421345','PK422345','PK423345','PK424345','PK425345','PK426345','PK427345','PK428345','PK429345','PK430345','PK431345','PK432345','PK433345','PK434345','PK435345','PK436345','PK437345','PK438345','PK439345','PK440345','PK441345','PK442345','PK443345','PK444345','PK445345','PK446345','PK447345','PK448345','PK449345','PK450345','PK451345','PK452345','PK453345','PK454345','PK455345','PK456345','PK457345','PK458345','PK459345','PK460345','PK461345','PK462345','PK463345','PK464345','PK465345','PK466345','PK467345','PK468345','PK469345','PK470345','PK471345','PK472345','PK473345','PK474345','PK475345','PK476345','PK477345','PK478345','PK479345','PK480345','PK481345','PK482345','PK483345','PK484345','PK485345','PK486345','PK487345','PK488345','PK489345','PK490345','PK491345','PK492345','PK493345','PK494345','PK495345','PK496345','PK497345','PK498345','PK499345','PK500345','PK501345','PK502345','PK503345','PK504345','PK505345','PK506345','PK507345','PK508345','PK509345','PK510345','PK511345','PK512345','PK513345','PK514345','PK515345','PK516345','PK517345','PK518345','PK519345','PK520345','PK521345','PK522345','PK523345','PK524345','PK525345','PK526345','PK527345','PK528345','PK529345','PK530345','PK531345','PK532345','PK533345','PK534345','PK535345','PK536345','PK537345','PK538345','PK539345','PK540345','PK541345','PK542345','PK543345','PK544345','PK545345','PK546345','PK547345','PK548345','PK549345','PK550345','PK551345','PK552345','PK553345','PK554345','PK555345','PK556345','PK557345','PK558345','PK559345','PK560345','PK561345','PK562345','PK563345','PK564345','PK565345','PK566345','PK567345','PK568345','PK569345','PK570345','PK571345','PK572345','PK573345','PK574345','PK575345','PK576345','PK577345','PK578345','PK579345','PK580345','PK581345','PK582345','PK583345','PK584345','PK585345','PK586345','PK587345','PK588345','PK589345','PK590345','PK591345','PK592345','PK593345','PK594345','PK595345','PK596345','PK597345','PK598345','PK599345','PK600345','PK601345','PK602345','PK603345','PK604345','PK605345','PK606345','PK607345','PK608345','PK609345','PK610345','PK611345','PK612345','PK613345','PK614345','PK615345','PK616345','PK617345','PK618345','PK619345','PK620345','PK621345','PK622345','PK623345','PK624345','PK625345','PK626345','PK627345','PK628345','PK629345','PK630345','PK631345','PK632345','PK633345','PK634345','PK635345','PK636345','PK637345','PK638345','PK639345','PK640345','PK641345','PK642345','PK643345','PK644345','PK645345','PK646345','PK647345','PK648345','PK649345','PK650345','PK651345','PK652345','PK653345','PK654345','PK655345','PK656345','PK657345','PK658345','PK659345','PK660345','PK661345','PK662345','PK663345','PK664345','PK665345','PK666345','PK667345','PK668345','PK669345','PK670345','PK671345','PK672345','PK673345','PK674345','PK675345','PK676345','PK677345','PK678345','PK679345','PK680345','PK681345','PK682345','PK683345','PK684345','PK685345','PK686345','PK687345','PK688345','PK689345','PK690345','PK691345','PK692345','PK693345','PK694345','PK695345','PK696345','PK697345','PK698345','PK699345','PK700345','PK701345','PK702345','PK703345','PK704345','PK705345','PK706345','PK707345','PK708345','PK709345','PK710345','PK711345','PK712345','PK713345','PK714345','PK715345','PK716345','PK717345','PK718345','PK719345','PK720345','PK721345','PK722345','PK723345','PK724345','PK725345','PK726345','PK727345','PK728345','PK729345','PK730345','PK731345','PK732345','PK733345','PK734345','PK735345','PK736345','PK737345','PK738345','PK739345','PK740345','PK741345','PK742345','PK743345','PK744345','PK745345','PK746345','PK747345','PK748345','PK749345','PK750345','PK751345','PK752345','PK753345','PK754345','PK755345','PK756345','PK757345','PK758345','PK759345','PK760345','PK761345','PK762345','PK763345','PK764345','PK765345','PK766345','PK767345','PK768345','PK769345','PK770345','PK771345','PK772345','PK773345','PK774345','PK775345','PK776345','PK777345','PK778345','PK779345','PK780345','PK781345','PK782345','PK783345','PK784345','PK785345','PK786345','PK787345','PK788345','PK789345','PK790345','PK791345','PK792345','PK793345','PK794345','PK795345','PK796345','PK797345','PK798345','PK799345','PK800345','PK801345','PK802345','PK803345','PK804345','PK805345','PK806345','PK807345','PK808345','PK809345','PK810345','PK811345','PK812345','PK813345','PK814345','PK815345','PK816345','PK817345','PK818345','PK819345','PK820345','PK821345','PK822345','PK823345','PK824345','PK825345','PK826345','PK827345','PK828345','PK829345','PK830345','PK831345','PK832345','PK833345','PK834345','PK835345','PK836345','PK837345','PK838345','PK839345','PK840345','PK841345','PK842345','PK843345','PK844345','PK845345','PK846345','PK847345','PK848345','PK849345','PK850345','PK851345','PK852345','PK853345','PK854345','PK855345','PK856345','PK857345','PK858345','PK859345','PK860345','PK861345','PK862345','PK863345','PK864345','PK865345','PK866345','PK867345','PK868345','PK869345','PK870345','PK871345','PK872345','PK873345','PK874345','PK875345','PK876345','PK877345','PK878345','PK879345','PK880345','PK881345','PK882345','PK883345','PK884345','PK885345','PK886345','PK887345','PK888345','PK889345','PK890345','PK891345','PK892345','PK893345','PK894345','PK895345','PK896345','PK897345','PK898345','PK899345','PK900345','PK901345','PK902345','PK903345','PK904345','PK905345','PK906345','PK907345','PK908345','PK909345','PK910345','PK911345','PK912345','PK913345','PK914345','PK915345','PK916345','PK917345','PK918345','PK919345','PK920345','PK921345','PK922345','PK923345','PK924345','PK925345','PK926345','PK927345','PK928345','PK929345','PK930345','PK931345','PK932345','PK933345','PK934345','PK935345','PK936345','PK937345','PK938345','PK939345','PK940345','PK941345','PK942345','PK943345','PK944345','PK945345','PK946345','PK947345','PK948345','PK949345','PK950345','PK951345','PK952345','PK953345','PK954345','PK955345','PK956345','PK957345','PK958345','PK959345','PK960345','PK961345','PK962345','PK963345','PK964345','PK965345','PK966345','PK967345','PK968345','PK969345','PK970345','PK971345','PK972345','PK973345','PK974345','PK975345','PK976345','PK977345','PK978345','PK979345','PK980345','PK981345','PK982345','PK983345','PK984345','PK985345','PK986345','PK987345','PK988345','PK989345','PK990345','PK991345','PK992345','PK993345','PK994345','PK995345','PK996345','PK997345','PK998345','PK999345')


-- EFCore OpenJson Query (10min 08sec)
DECLARE @__ids_0 nvarchar(max)
SET
    @__ids_0 = '["PK345","PK1345","PK2345","PK3345","PK4345","PK5345","PK6345","PK7345","PK8345","PK9345","PK10345","PK11345","PK12345","PK13345","PK14345","PK15345","PK16345","PK17345","PK18345","PK19345","PK20345","PK21345","PK22345","PK23345","PK24345","PK25345","PK26345","PK27345","PK28345","PK29345","PK30345","PK31345","PK32345","PK33345","PK34345","PK35345","PK36345","PK37345","PK38345","PK39345","PK40345","PK41345","PK42345","PK43345","PK44345","PK45345","PK46345","PK47345","PK48345","PK49345","PK50345","PK51345","PK52345","PK53345","PK54345","PK55345","PK56345","PK57345","PK58345","PK59345","PK60345","PK61345","PK62345","PK63345","PK64345","PK65345","PK66345","PK67345","PK68345","PK69345","PK70345","PK71345","PK72345","PK73345","PK74345","PK75345","PK76345","PK77345","PK78345","PK79345","PK80345","PK81345","PK82345","PK83345","PK84345","PK85345","PK86345","PK87345","PK88345","PK89345","PK90345","PK91345","PK92345","PK93345","PK94345","PK95345","PK96345","PK97345","PK98345","PK99345","PK100345","PK101345","PK102345","PK103345","PK104345","PK105345","PK106345","PK107345","PK108345","PK109345","PK110345","PK111345","PK112345","PK113345","PK114345","PK115345","PK116345","PK117345","PK118345","PK119345","PK120345","PK121345","PK122345","PK123345","PK124345","PK125345","PK126345","PK127345","PK128345","PK129345","PK130345","PK131345","PK132345","PK133345","PK134345","PK135345","PK136345","PK137345","PK138345","PK139345","PK140345","PK141345","PK142345","PK143345","PK144345","PK145345","PK146345","PK147345","PK148345","PK149345","PK150345","PK151345","PK152345","PK153345","PK154345","PK155345","PK156345","PK157345","PK158345","PK159345","PK160345","PK161345","PK162345","PK163345","PK164345","PK165345","PK166345","PK167345","PK168345","PK169345","PK170345","PK171345","PK172345","PK173345","PK174345","PK175345","PK176345","PK177345","PK178345","PK179345","PK180345","PK181345","PK182345","PK183345","PK184345","PK185345","PK186345","PK187345","PK188345","PK189345","PK190345","PK191345","PK192345","PK193345","PK194345","PK195345","PK196345","PK197345","PK198345","PK199345","PK200345","PK201345","PK202345","PK203345","PK204345","PK205345","PK206345","PK207345","PK208345","PK209345","PK210345","PK211345","PK212345","PK213345","PK214345","PK215345","PK216345","PK217345","PK218345","PK219345","PK220345","PK221345","PK222345","PK223345","PK224345","PK225345","PK226345","PK227345","PK228345","PK229345","PK230345","PK231345","PK232345","PK233345","PK234345","PK235345","PK236345","PK237345","PK238345","PK239345","PK240345","PK241345","PK242345","PK243345","PK244345","PK245345","PK246345","PK247345","PK248345","PK249345","PK250345","PK251345","PK252345","PK253345","PK254345","PK255345","PK256345","PK257345","PK258345","PK259345","PK260345","PK261345","PK262345","PK263345","PK264345","PK265345","PK266345","PK267345","PK268345","PK269345","PK270345","PK271345","PK272345","PK273345","PK274345","PK275345","PK276345","PK277345","PK278345","PK279345","PK280345","PK281345","PK282345","PK283345","PK284345","PK285345","PK286345","PK287345","PK288345","PK289345","PK290345","PK291345","PK292345","PK293345","PK294345","PK295345","PK296345","PK297345","PK298345","PK299345","PK300345","PK301345","PK302345","PK303345","PK304345","PK305345","PK306345","PK307345","PK308345","PK309345","PK310345","PK311345","PK312345","PK313345","PK314345","PK315345","PK316345","PK317345","PK318345","PK319345","PK320345","PK321345","PK322345","PK323345","PK324345","PK325345","PK326345","PK327345","PK328345","PK329345","PK330345","PK331345","PK332345","PK333345","PK334345","PK335345","PK336345","PK337345","PK338345","PK339345","PK340345","PK341345","PK342345","PK343345","PK344345","PK345345","PK346345","PK347345","PK348345","PK349345","PK350345","PK351345","PK352345","PK353345","PK354345","PK355345","PK356345","PK357345","PK358345","PK359345","PK360345","PK361345","PK362345","PK363345","PK364345","PK365345","PK366345","PK367345","PK368345","PK369345","PK370345","PK371345","PK372345","PK373345","PK374345","PK375345","PK376345","PK377345","PK378345","PK379345","PK380345","PK381345","PK382345","PK383345","PK384345","PK385345","PK386345","PK387345","PK388345","PK389345","PK390345","PK391345","PK392345","PK393345","PK394345","PK395345","PK396345","PK397345","PK398345","PK399345","PK400345","PK401345","PK402345","PK403345","PK404345","PK405345","PK406345","PK407345","PK408345","PK409345","PK410345","PK411345","PK412345","PK413345","PK414345","PK415345","PK416345","PK417345","PK418345","PK419345","PK420345","PK421345","PK422345","PK423345","PK424345","PK425345","PK426345","PK427345","PK428345","PK429345","PK430345","PK431345","PK432345","PK433345","PK434345","PK435345","PK436345","PK437345","PK438345","PK439345","PK440345","PK441345","PK442345","PK443345","PK444345","PK445345","PK446345","PK447345","PK448345","PK449345","PK450345","PK451345","PK452345","PK453345","PK454345","PK455345","PK456345","PK457345","PK458345","PK459345","PK460345","PK461345","PK462345","PK463345","PK464345","PK465345","PK466345","PK467345","PK468345","PK469345","PK470345","PK471345","PK472345","PK473345","PK474345","PK475345","PK476345","PK477345","PK478345","PK479345","PK480345","PK481345","PK482345","PK483345","PK484345","PK485345","PK486345","PK487345","PK488345","PK489345","PK490345","PK491345","PK492345","PK493345","PK494345","PK495345","PK496345","PK497345","PK498345","PK499345","PK500345","PK501345","PK502345","PK503345","PK504345","PK505345","PK506345","PK507345","PK508345","PK509345","PK510345","PK511345","PK512345","PK513345","PK514345","PK515345","PK516345","PK517345","PK518345","PK519345","PK520345","PK521345","PK522345","PK523345","PK524345","PK525345","PK526345","PK527345","PK528345","PK529345","PK530345","PK531345","PK532345","PK533345","PK534345","PK535345","PK536345","PK537345","PK538345","PK539345","PK540345","PK541345","PK542345","PK543345","PK544345","PK545345","PK546345","PK547345","PK548345","PK549345","PK550345","PK551345","PK552345","PK553345","PK554345","PK555345","PK556345","PK557345","PK558345","PK559345","PK560345","PK561345","PK562345","PK563345","PK564345","PK565345","PK566345","PK567345","PK568345","PK569345","PK570345","PK571345","PK572345","PK573345","PK574345","PK575345","PK576345","PK577345","PK578345","PK579345","PK580345","PK581345","PK582345","PK583345","PK584345","PK585345","PK586345","PK587345","PK588345","PK589345","PK590345","PK591345","PK592345","PK593345","PK594345","PK595345","PK596345","PK597345","PK598345","PK599345","PK600345","PK601345","PK602345","PK603345","PK604345","PK605345","PK606345","PK607345","PK608345","PK609345","PK610345","PK611345","PK612345","PK613345","PK614345","PK615345","PK616345","PK617345","PK618345","PK619345","PK620345","PK621345","PK622345","PK623345","PK624345","PK625345","PK626345","PK627345","PK628345","PK629345","PK630345","PK631345","PK632345","PK633345","PK634345","PK635345","PK636345","PK637345","PK638345","PK639345","PK640345","PK641345","PK642345","PK643345","PK644345","PK645345","PK646345","PK647345","PK648345","PK649345","PK650345","PK651345","PK652345","PK653345","PK654345","PK655345","PK656345","PK657345","PK658345","PK659345","PK660345","PK661345","PK662345","PK663345","PK664345","PK665345","PK666345","PK667345","PK668345","PK669345","PK670345","PK671345","PK672345","PK673345","PK674345","PK675345","PK676345","PK677345","PK678345","PK679345","PK680345","PK681345","PK682345","PK683345","PK684345","PK685345","PK686345","PK687345","PK688345","PK689345","PK690345","PK691345","PK692345","PK693345","PK694345","PK695345","PK696345","PK697345","PK698345","PK699345","PK700345","PK701345","PK702345","PK703345","PK704345","PK705345","PK706345","PK707345","PK708345","PK709345","PK710345","PK711345","PK712345","PK713345","PK714345","PK715345","PK716345","PK717345","PK718345","PK719345","PK720345","PK721345","PK722345","PK723345","PK724345","PK725345","PK726345","PK727345","PK728345","PK729345","PK730345","PK731345","PK732345","PK733345","PK734345","PK735345","PK736345","PK737345","PK738345","PK739345","PK740345","PK741345","PK742345","PK743345","PK744345","PK745345","PK746345","PK747345","PK748345","PK749345","PK750345","PK751345","PK752345","PK753345","PK754345","PK755345","PK756345","PK757345","PK758345","PK759345","PK760345","PK761345","PK762345","PK763345","PK764345","PK765345","PK766345","PK767345","PK768345","PK769345","PK770345","PK771345","PK772345","PK773345","PK774345","PK775345","PK776345","PK777345","PK778345","PK779345","PK780345","PK781345","PK782345","PK783345","PK784345","PK785345","PK786345","PK787345","PK788345","PK789345","PK790345","PK791345","PK792345","PK793345","PK794345","PK795345","PK796345","PK797345","PK798345","PK799345","PK800345","PK801345","PK802345","PK803345","PK804345","PK805345","PK806345","PK807345","PK808345","PK809345","PK810345","PK811345","PK812345","PK813345","PK814345","PK815345","PK816345","PK817345","PK818345","PK819345","PK820345","PK821345","PK822345","PK823345","PK824345","PK825345","PK826345","PK827345","PK828345","PK829345","PK830345","PK831345","PK832345","PK833345","PK834345","PK835345","PK836345","PK837345","PK838345","PK839345","PK840345","PK841345","PK842345","PK843345","PK844345","PK845345","PK846345","PK847345","PK848345","PK849345","PK850345","PK851345","PK852345","PK853345","PK854345","PK855345","PK856345","PK857345","PK858345","PK859345","PK860345","PK861345","PK862345","PK863345","PK864345","PK865345","PK866345","PK867345","PK868345","PK869345","PK870345","PK871345","PK872345","PK873345","PK874345","PK875345","PK876345","PK877345","PK878345","PK879345","PK880345","PK881345","PK882345","PK883345","PK884345","PK885345","PK886345","PK887345","PK888345","PK889345","PK890345","PK891345","PK892345","PK893345","PK894345","PK895345","PK896345","PK897345","PK898345","PK899345","PK900345","PK901345","PK902345","PK903345","PK904345","PK905345","PK906345","PK907345","PK908345","PK909345","PK910345","PK911345","PK912345","PK913345","PK914345","PK915345","PK916345","PK917345","PK918345","PK919345","PK920345","PK921345","PK922345","PK923345","PK924345","PK925345","PK926345","PK927345","PK928345","PK929345","PK930345","PK931345","PK932345","PK933345","PK934345","PK935345","PK936345","PK937345","PK938345","PK939345","PK940345","PK941345","PK942345","PK943345","PK944345","PK945345","PK946345","PK947345","PK948345","PK949345","PK950345","PK951345","PK952345","PK953345","PK954345","PK955345","PK956345","PK957345","PK958345","PK959345","PK960345","PK961345","PK962345","PK963345","PK964345","PK965345","PK966345","PK967345","PK968345","PK969345","PK970345","PK971345","PK972345","PK973345","PK974345","PK975345","PK976345","PK977345","PK978345","PK979345","PK980345","PK981345","PK982345","PK983345","PK984345","PK985345","PK986345","PK987345","PK988345","PK989345","PK990345","PK991345","PK992345","PK993345","PK994345","PK995345","PK996345","PK997345","PK998345","PK999345"]'

SELECT [Id] FROM TestTbl as p WHERE
EXISTS (SELECT 1 FROM OPENJSON(@__ids_0) WITH ([value] nvarchar(450) '$') AS [a]
        WHERE [a].[value] = [p].[Id] OR ([a].[value] IS NULL AND [p].[Id] IS NULL)
)


-- Drop Table
DROP TABLE [TestTbl]

@roji roji added this to the 9.0.0 milestone Mar 11, 2024
@roji
Copy link
Member

roji commented Mar 11, 2024

Putting this in the 9.0 milestone; once I'm done with some other matters I'll focus on primitive collections and on this issue.

@asabla
Copy link

asabla commented Mar 25, 2024

We've ran into the same issue as well. I feel you already have enough examples to work from @roji or do you want another one?

@roji
Copy link
Member

roji commented Mar 26, 2024

@asabla thanks... If your affected query differs from what has already been posted above (and also if it isn't on SQL Server), then yeah, getting some details could be helpful, just to have more context on exactly which situations trigger this.

@rmwin
Copy link

rmwin commented Apr 4, 2024

@roji
I just got here now. I'm more of a SQL Server person than a developer. It looks like the entire issue is caused by your contains, which is leading to a WHERE EXISTS. This, at least in the latest version, is causing SQL Server to take a dive into the deep end of performance hell and scan the base table and loop join into the JSON. In most cases, likely your test cases too, this will not matter. It will become an issue when you have a large base table. Say you have 10,000 records in the base table and scan 500 records in the JSON then you'll go through 5,000,000 records in total. That's a lot but very manageable for a modern CPU. Make that main table 1,000 times bigger and you'll be 1,000x slower.

If you use a JOIN (SELECT DISTINCT [value] FROM OPENJSON) then you can get a loop join from the JSON into the base table instead. You might also be able to process the number of records in the JSON first, and if larger than say 5,000 or by developer option, offer up the SQL Server hint of WITH (FORCESCAN) on the base table. This will then do a scan and, likely, hash join into the base table. This should still be pretty fast. Of course it would be better to peak the size of the JSON but that isn't as likely.

Here is a simple repro based on what someone else created earlier that might help:

-- Setup Table with PK1 -> PK1000000
SELECT TOP (1000000) Id = Concat('PK', CONVERT(INT, ROW_NUMBER() OVER (ORDER BY s1.[object_id])))
INTO #TestTbl
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2

-- Add Primary Key
ALTER TABLE #TestTbl
ADD PRIMARY KEY (Id);

DECLARE @__ids_0 nvarchar(max)
SET
@__ids_0 = '["PK345","PK1345","PK2345","PK3345","PK4345","PK5345","PK6345","PK7345","PK8345","PK9345","PK10345","PK11345","PK12345","PK13345","PK14345","PK15345","PK16345","PK17345","PK18345","PK19345","PK20345","PK21345","PK22345","PK23345","PK24345","PK25345","PK26345","PK27345","PK28345","PK29345","PK30345","PK31345","PK32345","PK33345","PK34345","PK35345","PK36345","PK37345","PK38345","PK39345","PK40345","PK41345","PK42345","PK43345","PK44345","PK45345","PK46345","PK47345","PK48345","PK49345","PK50345","PK51345","PK52345","PK53345","PK54345","PK55345","PK56345","PK57345","PK58345","PK59345","PK60345","PK61345","PK62345","PK63345","PK64345","PK65345","PK66345","PK67345","PK68345","PK69345","PK70345","PK71345","PK72345","PK73345","PK74345","PK75345","PK76345","PK77345","PK78345","PK79345","PK80345","PK81345","PK82345","PK83345","PK84345","PK85345","PK86345","PK87345","PK88345","PK89345","PK90345","PK91345","PK92345","PK93345","PK94345","PK95345","PK96345","PK97345","PK98345","PK99345","PK100345","PK101345","PK102345","PK103345","PK104345","PK105345","PK106345","PK107345","PK108345","PK109345","PK110345","PK111345","PK112345","PK113345","PK114345","PK115345","PK116345","PK117345","PK118345","PK119345","PK120345","PK121345","PK122345","PK123345","PK124345","PK125345","PK126345","PK127345","PK128345","PK129345","PK130345","PK131345","PK132345","PK133345","PK134345","PK135345","PK136345","PK137345","PK138345","PK139345","PK140345","PK141345","PK142345","PK143345","PK144345","PK145345","PK146345","PK147345","PK148345","PK149345","PK150345","PK151345","PK152345","PK153345","PK154345","PK155345","PK156345","PK157345","PK158345","PK159345","PK160345","PK161345","PK162345","PK163345","PK164345","PK165345","PK166345","PK167345","PK168345","PK169345","PK170345","PK171345","PK172345","PK173345","PK174345","PK175345","PK176345","PK177345","PK178345","PK179345","PK180345","PK181345","PK182345","PK183345","PK184345","PK185345","PK186345","PK187345","PK188345","PK189345","PK190345","PK191345","PK192345","PK193345","PK194345","PK195345","PK196345","PK197345","PK198345","PK199345","PK200345","PK201345","PK202345","PK203345","PK204345","PK205345","PK206345","PK207345","PK208345","PK209345","PK210345","PK211345","PK212345","PK213345","PK214345","PK215345","PK216345","PK217345","PK218345","PK219345","PK220345","PK221345","PK222345","PK223345","PK224345","PK225345","PK226345","PK227345","PK228345","PK229345","PK230345","PK231345","PK232345","PK233345","PK234345","PK235345","PK236345","PK237345","PK238345","PK239345","PK240345","PK241345","PK242345","PK243345","PK244345","PK245345","PK246345","PK247345","PK248345","PK249345","PK250345","PK251345","PK252345","PK253345","PK254345","PK255345","PK256345","PK257345","PK258345","PK259345","PK260345","PK261345","PK262345","PK263345","PK264345","PK265345","PK266345","PK267345","PK268345","PK269345","PK270345","PK271345","PK272345","PK273345","PK274345","PK275345","PK276345","PK277345","PK278345","PK279345","PK280345","PK281345","PK282345","PK283345","PK284345","PK285345","PK286345","PK287345","PK288345","PK289345","PK290345","PK291345","PK292345","PK293345","PK294345","PK295345","PK296345","PK297345","PK298345","PK299345","PK300345","PK301345","PK302345","PK303345","PK304345","PK305345","PK306345","PK307345","PK308345","PK309345","PK310345","PK311345","PK312345","PK313345","PK314345","PK315345","PK316345","PK317345","PK318345","PK319345","PK320345","PK321345","PK322345","PK323345","PK324345","PK325345","PK326345","PK327345","PK328345","PK329345","PK330345","PK331345","PK332345","PK333345","PK334345","PK335345","PK336345","PK337345","PK338345","PK339345","PK340345","PK341345","PK342345","PK343345","PK344345","PK345345","PK346345","PK347345","PK348345","PK349345","PK350345","PK351345","PK352345","PK353345","PK354345","PK355345","PK356345","PK357345","PK358345","PK359345","PK360345","PK361345","PK362345","PK363345","PK364345","PK365345","PK366345","PK367345","PK368345","PK369345","PK370345","PK371345","PK372345","PK373345","PK374345","PK375345","PK376345","PK377345","PK378345","PK379345","PK380345","PK381345","PK382345","PK383345","PK384345","PK385345","PK386345","PK387345","PK388345","PK389345","PK390345","PK391345","PK392345","PK393345","PK394345","PK395345","PK396345","PK397345","PK398345","PK399345","PK400345","PK401345","PK402345","PK403345","PK404345","PK405345","PK406345","PK407345","PK408345","PK409345","PK410345","PK411345","PK412345","PK413345","PK414345","PK415345","PK416345","PK417345","PK418345","PK419345","PK420345","PK421345","PK422345","PK423345","PK424345","PK425345","PK426345","PK427345","PK428345","PK429345","PK430345","PK431345","PK432345","PK433345","PK434345","PK435345","PK436345","PK437345","PK438345","PK439345","PK440345","PK441345","PK442345","PK443345","PK444345","PK445345","PK446345","PK447345","PK448345","PK449345","PK450345","PK451345","PK452345","PK453345","PK454345","PK455345","PK456345","PK457345","PK458345","PK459345","PK460345","PK461345","PK462345","PK463345","PK464345","PK465345","PK466345","PK467345","PK468345","PK469345","PK470345","PK471345","PK472345","PK473345","PK474345","PK475345","PK476345","PK477345","PK478345","PK479345","PK480345","PK481345","PK482345","PK483345","PK484345","PK485345","PK486345","PK487345","PK488345","PK489345","PK490345","PK491345","PK492345","PK493345","PK494345","PK495345","PK496345","PK497345","PK498345","PK499345","PK500345","PK501345","PK502345","PK503345","PK504345","PK505345","PK506345","PK507345","PK508345","PK509345","PK510345","PK511345","PK512345","PK513345","PK514345","PK515345","PK516345","PK517345","PK518345","PK519345","PK520345","PK521345","PK522345","PK523345","PK524345","PK525345","PK526345","PK527345","PK528345","PK529345","PK530345","PK531345","PK532345","PK533345","PK534345","PK535345","PK536345","PK537345","PK538345","PK539345","PK540345","PK541345","PK542345","PK543345","PK544345","PK545345","PK546345","PK547345","PK548345","PK549345","PK550345","PK551345","PK552345","PK553345","PK554345","PK555345","PK556345","PK557345","PK558345","PK559345","PK560345","PK561345","PK562345","PK563345","PK564345","PK565345","PK566345","PK567345","PK568345","PK569345","PK570345","PK571345","PK572345","PK573345","PK574345","PK575345","PK576345","PK577345","PK578345","PK579345","PK580345","PK581345","PK582345","PK583345","PK584345","PK585345","PK586345","PK587345","PK588345","PK589345","PK590345","PK591345","PK592345","PK593345","PK594345","PK595345","PK596345","PK597345","PK598345","PK599345","PK600345","PK601345","PK602345","PK603345","PK604345","PK605345","PK606345","PK607345","PK608345","PK609345","PK610345","PK611345","PK612345","PK613345","PK614345","PK615345","PK616345","PK617345","PK618345","PK619345","PK620345","PK621345","PK622345","PK623345","PK624345","PK625345","PK626345","PK627345","PK628345","PK629345","PK630345","PK631345","PK632345","PK633345","PK634345","PK635345","PK636345","PK637345","PK638345","PK639345","PK640345","PK641345","PK642345","PK643345","PK644345","PK645345","PK646345","PK647345","PK648345","PK649345","PK650345","PK651345","PK652345","PK653345","PK654345","PK655345","PK656345","PK657345","PK658345","PK659345","PK660345","PK661345","PK662345","PK663345","PK664345","PK665345","PK666345","PK667345","PK668345","PK669345","PK670345","PK671345","PK672345","PK673345","PK674345","PK675345","PK676345","PK677345","PK678345","PK679345","PK680345","PK681345","PK682345","PK683345","PK684345","PK685345","PK686345","PK687345","PK688345","PK689345","PK690345","PK691345","PK692345","PK693345","PK694345","PK695345","PK696345","PK697345","PK698345","PK699345","PK700345","PK701345","PK702345","PK703345","PK704345","PK705345","PK706345","PK707345","PK708345","PK709345","PK710345","PK711345","PK712345","PK713345","PK714345","PK715345","PK716345","PK717345","PK718345","PK719345","PK720345","PK721345","PK722345","PK723345","PK724345","PK725345","PK726345","PK727345","PK728345","PK729345","PK730345","PK731345","PK732345","PK733345","PK734345","PK735345","PK736345","PK737345","PK738345","PK739345","PK740345","PK741345","PK742345","PK743345","PK744345","PK745345","PK746345","PK747345","PK748345","PK749345","PK750345","PK751345","PK752345","PK753345","PK754345","PK755345","PK756345","PK757345","PK758345","PK759345","PK760345","PK761345","PK762345","PK763345","PK764345","PK765345","PK766345","PK767345","PK768345","PK769345","PK770345","PK771345","PK772345","PK773345","PK774345","PK775345","PK776345","PK777345","PK778345","PK779345","PK780345","PK781345","PK782345","PK783345","PK784345","PK785345","PK786345","PK787345","PK788345","PK789345","PK790345","PK791345","PK792345","PK793345","PK794345","PK795345","PK796345","PK797345","PK798345","PK799345","PK800345","PK801345","PK802345","PK803345","PK804345","PK805345","PK806345","PK807345","PK808345","PK809345","PK810345","PK811345","PK812345","PK813345","PK814345","PK815345","PK816345","PK817345","PK818345","PK819345","PK820345","PK821345","PK822345","PK823345","PK824345","PK825345","PK826345","PK827345","PK828345","PK829345","PK830345","PK831345","PK832345","PK833345","PK834345","PK835345","PK836345","PK837345","PK838345","PK839345","PK840345","PK841345","PK842345","PK843345","PK844345","PK845345","PK846345","PK847345","PK848345","PK849345","PK850345","PK851345","PK852345","PK853345","PK854345","PK855345","PK856345","PK857345","PK858345","PK859345","PK860345","PK861345","PK862345","PK863345","PK864345","PK865345","PK866345","PK867345","PK868345","PK869345","PK870345","PK871345","PK872345","PK873345","PK874345","PK875345","PK876345","PK877345","PK878345","PK879345","PK880345","PK881345","PK882345","PK883345","PK884345","PK885345","PK886345","PK887345","PK888345","PK889345","PK890345","PK891345","PK892345","PK893345","PK894345","PK895345","PK896345","PK897345","PK898345","PK899345","PK900345","PK901345","PK902345","PK903345","PK904345","PK905345","PK906345","PK907345","PK908345","PK909345","PK910345","PK911345","PK912345","PK913345","PK914345","PK915345","PK916345","PK917345","PK918345","PK919345","PK920345","PK921345","PK922345","PK923345","PK924345","PK925345","PK926345","PK927345","PK928345","PK929345","PK930345","PK931345","PK932345","PK933345","PK934345","PK935345","PK936345","PK937345","PK938345","PK939345","PK940345","PK941345","PK942345","PK943345","PK944345","PK945345","PK946345","PK947345","PK948345","PK949345","PK950345","PK951345","PK952345","PK953345","PK954345","PK955345","PK956345","PK957345","PK958345","PK959345","PK960345","PK961345","PK962345","PK963345","PK964345","PK965345","PK966345","PK967345","PK968345","PK969345","PK970345","PK971345","PK972345","PK973345","PK974345","PK975345","PK976345","PK977345","PK978345","PK979345","PK980345","PK981345","PK982345","PK983345","PK984345","PK985345","PK986345","PK987345","PK988345","PK989345","PK990345","PK991345","PK992345","PK993345","PK994345","PK995345","PK996345","PK997345","PK998345","PK999345"]'

-- Using inner join with select distinct
SELECT p.[Id]
FROM #TestTbl as p
INNER JOIN (SELECT DISTINCT [a].[value] FROM OPENJSON(@__ids_0) WITH ([value] nvarchar(450) '$') AS [a]) AS v([value])
ON [v].[value] = [p].[Id] OR ([v].[value] IS NULL AND [p].[Id] IS NULL)

Around 20ms with the plan
image

-- If a forcescan can be offered as an option
SELECT p.[Id]
FROM #TestTbl as p WITH(FORCESCAN) -- For large JSON files
INNER JOIN (SELECT DISTINCT [a].[value] FROM OPENJSON(@__ids_0) WITH ([value] nvarchar(450) '$') AS [a]) AS v([value])
ON [v].[value] = [p].[Id] OR ([v].[value] IS NULL AND [p].[Id] IS NULL)

Around 200ms with plan
image

-- Current EF Core 8 behaviour
SELECT [Id] FROM #TestTbl as p WHERE
EXISTS (SELECT [a].[value] FROM OPENJSON(@__ids_0) WITH ([value] nvarchar(450) '$') AS [a]
WHERE [a].[value] = [p].[Id] OR ([a].[value] IS NULL AND [p].[Id] IS NULL)
)

Around 5 minutes with plan
image

-- Drop Table
DROP TABLE [#TestTbl]

I hope that helps somewhat not sure if it adds something or not though as I got a little lost in all the reading this afternoon.

@roji
Copy link
Member

roji commented Apr 4, 2024

@rmwin thanks for the added input.

First, the SQL you quote as the current 8 SQL:

SELECT [Id] FROM #TestTbl as p WHERE
EXISTS (SELECT [a].[value] FROM OPENJSON(@__ids_0) WITH ([value] nvarchar(450) '$') AS [a]
WHERE [a].[value] = [p].[Id] OR ([a].[value] IS NULL AND [p].[Id] IS NULL)

Which exact version of EF Core are you using? For 8.0.2 I implemented #32574, which should cause that SQL to no longer get produced. The more optimized variant produced by 8.0 using an uncorrelated IN+subquery (instead of a correlated EXISTS+subquery) - this way already produced in 8.0 when the column in question was non-nullable, but since 8.0.2 should also be produced when the column is nullable. Can you please double-check and confirm exactly what's going on, and report on the performance you're seeing with the better, uncorrelated IN variant?

Specifically re your suggestion to use INNER JOIN instead of EXISTS/IN+subquery (this is something we discussed); one problem there is that if the JSON array contains duplicates, result rows are duplicated as well. So for this to work without changing results EF would need to perform a client-side distinct on the values before sending the JSON array parameter, which is additional complexity and something EF doesn't currently support.

@rmwin
Copy link

rmwin commented Apr 5, 2024

@roji,
TLDR:

  • We are fine with the latest version, my previous info was a little out of date.
  • I think that the INNER JOIN can work for you and give you a few more options as long as you use DISTINCT as well.

just checked again with our developers and we are getting the IN + subquery. I was previously told that we got the exists but they've probably updated EFC since then. The performance is definitely better with the IN.

I certainly accept that duplicates could be an issue with an INNER JOIN. In the query I proposed there is a DISTINCT to prevent this. It was still faster, in my testing, than the IN. It also leaves open the option to add a hint to force a scan of the base table, which can be more efficient in some cases. Just a thought in case you want to include it at some point. A FORCESCAN would likely lead to a hash join, rather than a loop join. If you know that you will have a large list of values in your JSON and the base table is not too big, then this could lead to surprising performance improvements. In general SQL Server will choose a scan over a seek once that you get to around 2%-3% of records in the table being read. The number of trips through the index tree makes the index seek, and potentially lookup, not worth the effort. If you can use the JSON to go directly into a clustered index then you would likely see an improvement with a scan once you get around 10%-20% of the table accessed, depending on table size. Hence I thought it might be a nice option to add to help developers who know when to force a scan :)

@roji
Copy link
Member

roji commented Apr 5, 2024

@rmwin thanks for going back and reconfirming - good to hear that the latest version performs well for you.

First, we've been in discussions with SQL Server, where it may be possible for the query plan to have better row estimates for the number of rows coming out of OPENJSON; the opacity of OPENJSON to the planner is actually the source of the remaining performance issues being reported above (if my understanding is correct) - SQL Server chooses the wrong plan based on a wrong assumption of how many rows would come out of the function, although it could have the relevant information to know that.

Because of this, I'm slightly reluctant to add a DISTINCT to an OPENJSON subquery, as this may yet again mess with the planner's ability to choose the right plan around OPENJSON - but that may be unfounded. I'd in any case be interested in a clear test-case showing that INNER JOIN over OPENJSON (ideally with DISTINCT) performs better than the current uncorrelated IN+subquery feature - is this something you might be able to do?

Finally, re FORCESCAN... SQL Server query hints are generally covered via #6717. We generally haven't prioritized hints up to now since they are generally discouraged and documented for use only as a last resort (docs), and using EF generally implies accepting some level of abstraction without necessarily controlling every query detail (after all, LINQ is being used to express the query - for specific queries where very specific SQL is required, it's recommended to simply use SQL directly and use EF to materialize the results). Allowing users to specify hints on arbitrary LINQ query parts is also not always easy to enable, for various reasons.

So while I don't think we're opposed to allowing expressing query hints - but we'd rather do our best to have things "just work" without forcing users into low-level tinkering, if at all possible (which mirrors the approach of SQL Server itself).

@Jared-Rem
Copy link

Jared-Rem commented May 17, 2024

@Suchiman I was able work around the issue you described (without the drawbacks of EF.Constant or forcing a lower compatibility level using UseCompatibilityLevel(120)) by joining against the in-memory collection as follows:

from ak in Db.AmsKunden
join pId in profileIds on ak.Profile_Id equals pId
select ...

While this does improve performance, there is an entity framework bug (at least for v8.0.4) that is worth noting. If you use .AsSplitQuery(), the SQL generated for the "split queries" will be invalid due to the different approach to parsing JSON values shown below.

Root Query:
INNER JOIN OPENJSON(__profileIds_0) AS [p] ON [a].[Profile_Id] = CAST([p].[value] AS int)

Split Queries:
INNER JOIN OPENJSON(__profileIds_0) WITH ([value] int '$') AS [p] ON [a].[Profile_Id] = [p].[value]

Due to this difference, the split query's SELECT and ORDER BY statements will reference an invalid column and the following exception will be thrown:

Invalid column name 'key'.

@roji I was able to find tickets for similar issues, most slated to be fixed in v9.0, but I'm not sure this exact bug has been documented.

@roji
Copy link
Member

roji commented May 30, 2024

Note npgsql/efcore.pg#3188 (comment), where the claim is being made that constants perform better on PostgreSQL on a very simple query, with the presence of indexing. We should look into this.

This turned out to be a case of incorrect parameter typing because of a badly-configured model (user error).

@MichaCo
Copy link

MichaCo commented Jun 16, 2024

Hi @roji, I just got here after migrating our app from NET6 to NET8 including EF updates and running into performance issues.

I just wanted to provide another, relatively simple scenario to reproduce the issue. This is a simplified version of what I found in one of our queries, The real query is a bit more complex with an outer join over the Id and returning another value. That does not make much of a difference though.

(I'm no SQL pro, so please don't judge my SQL skills to produce test data ;) )

drop table if exists #Test
create table #Test(
	Id int  identity not null,
	OtherId int not null,
	RefA uniqueidentifier null,
	RefB uniqueidentifier null,
	RefC uniqueidentifier null
)

create unique clustered index IX_PK on #test (Id)

CREATE NONCLUSTERED INDEX IX_TestA ON #Test
(
	RefA ASC
)
CREATE NONCLUSTERED INDEX IX_TestB ON #Test
(
	RefB ASC
)
CREATE NONCLUSTERED INDEX IX_TestC ON #Test
(
	RefC ASC
)
CREATE NONCLUSTERED INDEX IX_Test ON #Test
(
	RefA ASC,
	RefB ASC,
	RefC ASC	
) include (OtherId, Id) 

declare @i int = 0

WHILE @i < 10000
BEGIN
    SET @i = @i + 1
	insert into #Test (OtherId, RefA, RefB, RefC) values 
		(10, 'de14de64-0e24-ef11-b1b8-00505684a3fb', null, null)
		,(11, null, '8115de64-0e24-ef11-b1b8-00505684a3fb', null)
		,(12, null, '7f15de64-0e24-ef11-b1b8-00505684a3fb', null)
		,(13, null, null, null)
END

set statistics io, time on

declare @idsA nvarchar(4000) =N'["de14de64-0e24-ef11-b1b8-00505684a3fb","df14de64-0e24-ef11-b1b8-00505684a3fb","e014de64-0e24-ef11-b1b8-00505684a3fb","e114de64-0e24-ef11-b1b8-00505684a3fb","e214de64-0e24-ef11-b1b8-00505684a3fb","e314de64-0e24-ef11-b1b8-00505684a3fb","e414de64-0e24-ef11-b1b8-00505684a3fb","e514de64-0e24-ef11-b1b8-00505684a3fb","e614de64-0e24-ef11-b1b8-00505684a3fb","e714de64-0e24-ef11-b1b8-00505684a3fb","e814de64-0e24-ef11-b1b8-00505684a3fb","e914de64-0e24-ef11-b1b8-00505684a3fb","ea14de64-0e24-ef11-b1b8-00505684a3fb","eb14de64-0e24-ef11-b1b8-00505684a3fb","ec14de64-0e24-ef11-b1b8-00505684a3fb","ed14de64-0e24-ef11-b1b8-00505684a3fb","ee14de64-0e24-ef11-b1b8-00505684a3fb"]'
,@idsB nvarchar(4000) = N'["7f15de64-0e24-ef11-b1b8-00505684a3fb","8015de64-0e24-ef11-b1b8-00505684a3fb","8115de64-0e24-ef11-b1b8-00505684a3fb","8215de64-0e24-ef11-b1b8-00505684a3fb","8315de64-0e24-ef11-b1b8-00505684a3fb","8415de64-0e24-ef11-b1b8-00505684a3fb","8515de64-0e24-ef11-b1b8-00505684a3fb","8615de64-0e24-ef11-b1b8-00505684a3fb","8715de64-0e24-ef11-b1b8-00505684a3fb","8815de64-0e24-ef11-b1b8-00505684a3fb","8915de64-0e24-ef11-b1b8-00505684a3fb","8a15de64-0e24-ef11-b1b8-00505684a3fb","8b15de64-0e24-ef11-b1b8-00505684a3fb","8c15de64-0e24-ef11-b1b8-00505684a3fb","8d15de64-0e24-ef11-b1b8-00505684a3fb","8e15de64-0e24-ef11-b1b8-00505684a3fb","8f15de64-0e24-ef11-b1b8-00505684a3fb","9015de64-0e24-ef11-b1b8-00505684a3fb","9115de64-0e24-ef11-b1b8-00505684a3fb","9215de64-0e24-ef11-b1b8-00505684a3fb","9315de64-0e24-ef11-b1b8-00505684a3fb","9415de64-0e24-ef11-b1b8-00505684a3fb","9515de64-0e24-ef11-b1b8-00505684a3fb","9615de64-0e24-ef11-b1b8-00505684a3fb","9715de64-0e24-ef11-b1b8-00505684a3fb","9815de64-0e24-ef11-b1b8-00505684a3fb"]'
,@idsC nvarchar(4000) =N'["7f15de64-0e24-ef11-b1b8-00505684a3fb"]'

select count(OtherId) from #Test
where RefA in (
	SELECT [value]
	FROM OPENJSON(@idsA) WITH ([value] uniqueidentifier '$')
)
OR RefB in (
	SELECT [value]
	FROM OPENJSON(@idsB) WITH ([value] uniqueidentifier '$')
)
OR RefC in (
	SELECT [value]
	FROM OPENJSON(@idsC) WITH ([value] uniqueidentifier '$')
)

The "fun" thing I have noticed is that SQL Server chooses a bad query plan at some point. For this query it chooses a nested loop + filter to 3 OPENJSON blocks only if there are 3 optional where clauses, which is very slow:

image

If I remove one or two ORs, the query is blazing fast (no nested loop)
image

:update:
One thing I found during testing was that if you insert the result of each OPENJSON sub query into a temp table, and then use that temp table in the IN statement instead, the query works perfectly fine also!


Regarding solving the problem, I didn't read all the comments here but it sounded like as if I can only opt out by setting the SQL compat level to 120.
The per query level opt out is not really an option for us with ~20 different services and different databases doing 1000s of different queries...

Thanks,
Michael

@stevenxi
Copy link

stevenxi commented Jul 7, 2024

Hi @roji, I just got here after migrating our app from NET6 to NET8 including EF updates and running into performance issues.

I just wanted to provide another, relatively simple scenario to reproduce the issue. This is a simplified version of what I found in one of our queries, The real query is a bit more complex with an outer join over the Id and returning another value. That does not make much of a difference though.

(I'm no SQL pro, so please don't judge my SQL skills to produce test data ;) )

drop table if exists #Test
create table #Test(
	Id int  identity not null,
	OtherId int not null,
	RefA uniqueidentifier null,
	RefB uniqueidentifier null,
	RefC uniqueidentifier null
)

create unique clustered index IX_PK on #test (Id)

CREATE NONCLUSTERED INDEX IX_TestA ON #Test
(
	RefA ASC
)
CREATE NONCLUSTERED INDEX IX_TestB ON #Test
(
	RefB ASC
)
CREATE NONCLUSTERED INDEX IX_TestC ON #Test
(
	RefC ASC
)
CREATE NONCLUSTERED INDEX IX_Test ON #Test
(
	RefA ASC,
	RefB ASC,
	RefC ASC	
) include (OtherId, Id) 

declare @i int = 0

WHILE @i < 10000
BEGIN
    SET @i = @i + 1
	insert into #Test (OtherId, RefA, RefB, RefC) values 
		(10, 'de14de64-0e24-ef11-b1b8-00505684a3fb', null, null)
		,(11, null, '8115de64-0e24-ef11-b1b8-00505684a3fb', null)
		,(12, null, '7f15de64-0e24-ef11-b1b8-00505684a3fb', null)
		,(13, null, null, null)
END

set statistics io, time on

declare @idsA nvarchar(4000) =N'["de14de64-0e24-ef11-b1b8-00505684a3fb","df14de64-0e24-ef11-b1b8-00505684a3fb","e014de64-0e24-ef11-b1b8-00505684a3fb","e114de64-0e24-ef11-b1b8-00505684a3fb","e214de64-0e24-ef11-b1b8-00505684a3fb","e314de64-0e24-ef11-b1b8-00505684a3fb","e414de64-0e24-ef11-b1b8-00505684a3fb","e514de64-0e24-ef11-b1b8-00505684a3fb","e614de64-0e24-ef11-b1b8-00505684a3fb","e714de64-0e24-ef11-b1b8-00505684a3fb","e814de64-0e24-ef11-b1b8-00505684a3fb","e914de64-0e24-ef11-b1b8-00505684a3fb","ea14de64-0e24-ef11-b1b8-00505684a3fb","eb14de64-0e24-ef11-b1b8-00505684a3fb","ec14de64-0e24-ef11-b1b8-00505684a3fb","ed14de64-0e24-ef11-b1b8-00505684a3fb","ee14de64-0e24-ef11-b1b8-00505684a3fb"]'
,@idsB nvarchar(4000) = N'["7f15de64-0e24-ef11-b1b8-00505684a3fb","8015de64-0e24-ef11-b1b8-00505684a3fb","8115de64-0e24-ef11-b1b8-00505684a3fb","8215de64-0e24-ef11-b1b8-00505684a3fb","8315de64-0e24-ef11-b1b8-00505684a3fb","8415de64-0e24-ef11-b1b8-00505684a3fb","8515de64-0e24-ef11-b1b8-00505684a3fb","8615de64-0e24-ef11-b1b8-00505684a3fb","8715de64-0e24-ef11-b1b8-00505684a3fb","8815de64-0e24-ef11-b1b8-00505684a3fb","8915de64-0e24-ef11-b1b8-00505684a3fb","8a15de64-0e24-ef11-b1b8-00505684a3fb","8b15de64-0e24-ef11-b1b8-00505684a3fb","8c15de64-0e24-ef11-b1b8-00505684a3fb","8d15de64-0e24-ef11-b1b8-00505684a3fb","8e15de64-0e24-ef11-b1b8-00505684a3fb","8f15de64-0e24-ef11-b1b8-00505684a3fb","9015de64-0e24-ef11-b1b8-00505684a3fb","9115de64-0e24-ef11-b1b8-00505684a3fb","9215de64-0e24-ef11-b1b8-00505684a3fb","9315de64-0e24-ef11-b1b8-00505684a3fb","9415de64-0e24-ef11-b1b8-00505684a3fb","9515de64-0e24-ef11-b1b8-00505684a3fb","9615de64-0e24-ef11-b1b8-00505684a3fb","9715de64-0e24-ef11-b1b8-00505684a3fb","9815de64-0e24-ef11-b1b8-00505684a3fb"]'
,@idsC nvarchar(4000) =N'["7f15de64-0e24-ef11-b1b8-00505684a3fb"]'

select count(OtherId) from #Test
where RefA in (
	SELECT [value]
	FROM OPENJSON(@idsA) WITH ([value] uniqueidentifier '$')
)
OR RefB in (
	SELECT [value]
	FROM OPENJSON(@idsB) WITH ([value] uniqueidentifier '$')
)
OR RefC in (
	SELECT [value]
	FROM OPENJSON(@idsC) WITH ([value] uniqueidentifier '$')
)

The "fun" thing I have noticed is that SQL Server chooses a bad query plan at some point. For this query it chooses a nested loop + filter to 3 OPENJSON blocks only if there are 3 optional where clauses, which is very slow:

image

If I remove one or two ORs, the query is blazing fast (no nested loop) image

:update: One thing I found during testing was that if you insert the result of each OPENJSON sub query into a temp table, and then use that temp table in the IN statement instead, the query works perfectly fine also!

Regarding solving the problem, I didn't read all the comments here but it sounded like as if I can only opt out by setting the SQL compat level to 120. The per query level opt out is not really an option for us with ~20 different services and different databases doing 1000s of different queries...

Thanks, Michael

We've been using temp table to query "contains" for any collection contains more than 20 elements over quite many years.
There is still a catch, the index on the temp table is also relevant to the planning. But generally speaking, it helps a lot.

We've created a helper class to manage this situation, when elements less than 20, it uses the normal contains, otherwise it would translate the query into a "join".

Btw, we've also noticed the dramatic performance drop when upgrade to EF Core 8. This OPENJSON doesn't seem to be a good idea.

@MichaCo
Copy link

MichaCo commented Jul 7, 2024

@stevenxi I don't think using OPENJSON for contains is bad in general. On the contrary, I do see performance improvements over the old implementation in some queries.

The problem for me here is that just upgrading our code base to NET8 without changing any of our code, causes maybe 5% out of 1000s of different queries to fail (being too slow = fail/bug).
That's still 100-200 places where I would have to go and manually change the queries to not be slow.
The problem here is, there is no way for me to automate that or actually find all those places reliably? Many queries might only run under certain conditions and its basically impossible for me to test and validate every variation, especially with some more advanced or dynamic stuff...
And I'm not given the budget and time to do that manually..

That's why I'm hoping for a more general fix which does not involve making opt-in/out decisions on a per query level. That's not something a framework like EF Core should force on their consumers in my opinion @roji because of how much such a migration can cost.

So, are the ways to fix this without having to test and decide on a query by query level?

  • Changing the SQL compat level to 120 globally works but is like the worst "fix" for this.
    In many cases I see better performance using EF Core 8 vs. EF Core 6. But only if I do not change the compat level...
  • To be fair, for me this looks like a bug in SQL Server's query plan engine and should maybe fixed on that level?
    The slow query plans I've seen for these cases are so bad that I really don't get why SQL Server would even pick it...
  • Maybe there is a way to always convince SQL Server to use a proper query plan?
    • For example, forcing HASH MATCH joins does work. This does prevent the nested loop joins at least in my example. But it is really hard to tell if that's a good general choice because HASH MATCH has an overhead...
  • Creating temp tables out of the GEOJSON part before using it seems to be pretty solid.
    This actually doesn't have to be a #temp db table. It also works just fine with in-memory table variables.
  • Other ideas?

@roji is there any plan on handling this issue in the current or next version of EF Core out of the box, and if so, do you have some code or some ideas to share? I'm happy to help with testing for example.

@roji
Copy link
Member

roji commented Jul 8, 2024

@MichaCo I know this issue is long, but your questions have already been addressed above.

To summarize:

Changing the SQL compat level to 120 globally works but is like the worst "fix" for this.

We will probably be introducing a more specific global option to control this specific thing.

To be fair, for me this looks like a bug in SQL Server's query plan engine and should maybe fixed on that level?

This isn't a bug, but rather a real difficulty in planning; different array values have different optimal plans, and when EF integrates constant values into the query (as in pre-8.0), SQL Server can basically plan very specifically for that SQL with that specific number of parameters. Whereas with OPENJSON the array is parameterized, and SQL Server is unable to do the same. The SQL Server people are aware and there are some improvements possibly planned, but the basic problem here is real and cannot be fully solved.

For example, forcing HASH MATCH joins does work. This does prevent the nested loop joins at least in my example. But it is really hard to tell if that's a good general choice because HASH MATCH has an overhead...

Forcing plans with hints is very data-specific and brittle; it may work given you have a specific amount of data in your table (or are using a parameterized array of a certain size), but not in other cases. It's not possible to come up with a general thing that will work optimally in all cases.

Creating temp tables out of the GEOJSON part before using it seems to be pretty solid.

Creating temp tables for each and every query can be very bad for perf in its own way. Temp tables work well for a single big operation, but I don't think it makes sense to create one on a per-query basis. I have specifically measured using in-memory table variables, but I still suspect that the overhead of creating and then destroying them will make this option unfeasible.

@MichaCo
Copy link

MichaCo commented Jul 8, 2024

@roji Sorry for repeating some questions. I was just trying to emphasis how bad it might be to have to opt out globally (or query per query).

At least in my opinion both those options are not great:
I do like the change of using OPENJSON, I do see a lot of performance gains in many queries.
A global opt-out is safer but doesn't improve my apps, obviously.
A per query opt-in/out forces me to spend a considerable amount of time of re-factoring, which is a big ask.

TLDR: I was just hoping we could find a way where EF could make that decision automatically, and could choose the best implementation based on certain conditions.
But sounds like this is not possible 😢

Maybe we just need some guidance or tools for making informed query by query decisions.
To do so, we'd have to measure performance for each query using 2-3 different variants of SQL generation.
Is there an easy way to do that with EF today, like running the same query n times with different variants and record the results? This could be a unit/integration testing thing only, because you probably don't want to do that in production.
(I did look into the LogTo stuff, that might be helpful for recording results but that's not really enough)

Anyways, thanks for the detailed reply above and for listening ;)

@PaulVrugt
Copy link

@roji
TLDR: I was just hoping we could find a way where EF could make that decision automatically, and could choose the best implementation based on certain conditions. But sounds like this is not possible 😢

I completely agree. Neither of the options are a good idea.

  • A global switch simply reverts back to the old "not optimal" situation of having to create a query plan per query
  • A "per query" switch forces us to make some kind of decision per query on whether or not to use the OPENJSON variant, but based on what data? How on earth are we supposed to make an informed decision PER QUERY on whether or not to use OPENJSON

I don't know what the solution is, but there should be some of automatic process of determining the correct execution mode

@roji
Copy link
Member

roji commented Jul 8, 2024

TLDR: I was just hoping we could find a way where EF could make that decision automatically, and could choose the best implementation based on certain conditions.
[...]
I don't know what the solution is, but there should be some of automatic process of determining the correct execution mode

Unfortunately, there are some situations where there is no single good answer, and there are also limits to what can be expected of an ORM. We are very unlikely to start building in logic to make EF automatically try to figure out which translation is better for a given LINQ query - at that point we're starting to implement database query planning-like functionality ourselves. That is just not within the scope of EF.

The above comments concentrate on how it's impossible to go over each and every query in the program, and to gauge which option to use. I'd strongly suggest identifying the critical, hot queries, and possibly doing the analysis on those; these are typically relatively few queries in regular applications, and spending time on queries that run rarely or where the OPENJSON question doesn't have any meaningful impact simply doesn't make sense.

@MelGrubb
Copy link

MelGrubb commented Jul 8, 2024

I'd be happy with a global switch and a way to opt in/out on a per-query basis. If the new behavior works for most of your queries, then opt in by default and opt out the problematic queries. If you want to play it safe, then opt out by default and only opt in your queries once you can test and verify them individually. Just give me the controls.

@julienFlexsoft
Copy link

I agree that a global switch and a per-query opt-in/out is the way to go. It offers full control to the developer. I could for example build my own extension that would re-try the query with the OPENJSON turned off if it fails with a time-out.
But for doing it automatically with EF Core maybe we can isolate scenario's where OPENJSON would always be a worse option. I thought that the performance was especially bad with nullable NVARCHAR(MAX) columns. I'm not 100% sure about this, but that might be a way to offer some kind of automatic tuning.

@roji
Copy link
Member

roji commented Jul 9, 2024

@julienFlexsoft I'm not aware of cases where we can know in advance that OPENJSON is always worse (not sure exactly what kind of query you're referring to with nullable NVARCHAR(MAX)).

@superjulius
Copy link

I agree with @MelGrubb that a combination of a global switch and a per-query opt-in/out sounds the right way of doing it. It gives full control to the developers for their scenarios which will be always different by nature.

This will move the responsibility out of our beloved ORM to us, knowing that it won't be able to guess what is best as @roji well explained, and then we will only be able to complain about us not doing the right choice :-)

@devmanlab
Copy link

Having the capability to control it would be perfect with per query opt-in/out. I will prohibit global switch in my team, if it is introduced for this purpose - IMHO it is awful approach for this problem.
Even if later performance for such queries is resolved, the ability to control it on a per-query basis will still be useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests