Slow performance on string contains expression. #3188
Comments
Link to source code. |
Can you please post an actual minimal example showing the SQL above getting generated, and say which version you're using? EF doesn't generate that SQL for the LINQ you provided; given the minimal code sample below, I'm getting the following:
SELECT b."Id", b."Name"
FROM "Blogs" AS b
WHERE b."Name" = ANY (@__names_0)
You can play around with the code sample just below to make it produce the problematic SQL, and then post that.
Minimal code sample
await using var context = new BlogContext();
await context.Database.EnsureDeletedAsync();
await context.Database.EnsureCreatedAsync();
string[] names = ["Blog1", "Blog2", "Blog3"];
_ = await context.Blogs.Where(b => names.Contains(b.Name)).ToListAsync();
public class BlogContext : DbContext
{
public DbSet<Blog> Blogs { get; set; }
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
=> optionsBuilder
.UseNpgsql("Host=localhost;Username=test;Password=test")
.LogTo(Console.WriteLine, LogLevel.Information)
.EnableSensitiveDataLogging();
}
public class Blog
{
public int Id { get; set; }
public string Name { get; set; }
} |
@roji Yeah. Your sample is quite enough for the issue. Can you check the value of the @__names_0 parameter? |
@kingnguyen93 there seems to be a bit of confusion here. The parameter cannot be What I think you're trying to say, is that when there's no parameter at all - i.e. EF generates SQL with the array contents as constants in the SQL ( All this has been discussed very extensively for SQL Server, please see #32394. It's true that I haven't focused on how PostgreSQL behaves here, but it's possible that it may be better to integrate constants in some cases. I'll make a note to investigate this. |
@roji Yep, you can take a look at 2 images above and line number 5. I just change condition from |
@kingnguyen93 can you please help provide a minimal repro for this? For a trivial query it does not seem to repro - please see the below console program which creates and seeds a database, and then prints the query plans for both query types. In both cases I'm getting:
Note that I'm not saying there isn't a problem - it just doesn't repro for this trivial case (making the query more complex could trigger it - that's what I need help with).
await using var dataSource = NpgsqlDataSource.Create("Host=localhost;Username=test;Password=test");
await using var conn = await dataSource.OpenConnectionAsync();
{
await using var command = new NpgsqlCommand(
"""
DROP TABLE IF EXISTS numbers;
CREATE TABLE numbers AS
SELECT x FROM generate_series(1,1000000) x;
CREATE INDEX IX_x on numbers(x);
ANALYZE numbers;
""", conn);
await command.ExecuteNonQueryAsync();
}
// Constants in ANY
{
await using var command = new NpgsqlCommand("EXPLAIN SELECT * FROM numbers WHERE x = ANY('{1,3}')", conn);
var plan = (string)(await command.ExecuteScalarAsync())!;
Console.WriteLine(plan);
}
// Parameter in ANY
{
await using var command = new NpgsqlCommand("EXPLAIN SELECT * FROM numbers WHERE x = ANY($1)", conn);
command.Parameters.Add(new() { Value = new[] { 1, 3 } });
var plan = (string)(await command.ExecuteScalarAsync())!;
Console.WriteLine(plan);
} |
@roji Here's my table. CREATE TABLE "public"."crm_calllogs" (
"stt_rec" "public"."ud_stt_rec" COLLATE "pg_catalog"."default" NOT NULL,
"stt_rec_doitac" "public"."ud_stt_rec" COLLATE "pg_catalog"."default" NOT NULL,
"call_id" "public"."ud_char50" COLLATE "pg_catalog"."default",
"dien_giai" "public"."ud_memo" COLLATE "pg_catalog"."default",
"ma_nvbh" "public"."ud_ma" COLLATE "pg_catalog"."default" NOT NULL,
"ma_nhom" "public"."ud_ma" COLLATE "pg_catalog"."default" NOT NULL,
"ma_dvcs" "public"."ud_ma" COLLATE "pg_catalog"."default",
"date0" "public"."ud_ngay" NOT NULL,
"time0" "public"."ud_time" NOT NULL,
"user_id0" "public"."ud_smallint" NOT NULL,
"status" "public"."ud_status" COLLATE "pg_catalog"."default" NOT NULL,
"date2" "public"."ud_ngay",
"time2" "public"."ud_time",
"user_id2" "public"."ud_smallint",
"ketnoi_yn" "public"."ud_smallint" NOT NULL
);
ALTER TABLE "public"."crm_calllogs" ADD CONSTRAINT "crm_calllogs_pkey" PRIMARY KEY ("stt_rec");
CREATE INDEX "crm_calllogs_date0_time0_idx" ON "public"."crm_calllogs" USING btree (
"date0" "pg_catalog"."date_ops" DESC NULLS FIRST,
"time0" "pg_catalog"."time_ops" DESC NULLS FIRST
);
CREATE INDEX "crm_calllogs_stt_rec_doitac_idx" ON "public"."crm_calllogs" USING btree (
"stt_rec_doitac" COLLATE "pg_catalog"."default" "pg_catalog"."bpchar_ops" ASC NULLS LAST
);
I noticed that this seems related to index scans and data types when using ARRAY. ud_stt_rec is |
@roji I think the problem is that SQL generated from Entity Framework is using |
@kingnguyen93 if that's true, then you should be able to force |
@roji I already tested that. I think the root issue is in the PG engine when converting |
But that's why I suggested adding
Note that there's no string parameter anywhere here (at least for now). What I'm trying to figure out here - with your help - is what is the exact source of the plan difference. If this is a If it really is a question of ANY with constants inside vs. ANY with an array inside, then this would be the same as dotnet/efcore#32394 which I linked to above (I suspect that's the case). |
@roji Here's my test results:
|
Can you please run |
@NinoFloris Here. Execution Time: 0.044 ms |
OK, so this is just a case of the array type ( We now need to understand why EFCore.PG sends an array with the wrong type. To know this, I need to see how you're configuring your EF model - can you start by confirming that you're actually configuring |
@roji I haven't configured it, because I'm developing against an existing database and it would take time to configure. :))))) |
At the protocol level we have to send a type for a parameter, it's likely that we end up sending text[] there |
Then that's very likely the source of your problem. Even if you use EF against an existing database, if you don't tell EF what the column types are, it can't generate the right SQL and/or parameters there. As @NinoFloris wrote, when a parameter is sent to the database, it's strongly-typed, and it's important for that type to be correct. Since EF isn't properly configured here, it simply assumes that the type is a PG Try configuring the property as |
@roji I tried configuring the data type, but I see that the parameter sent to the command is not strongly typed. |
That's just EF's logging which is unaware of the Npgsql-specific typing; that's set on NpgsqlParameter.NpgsqlDbType, which EF's logging mechanism doesn't know. Try running the code and seeing how it performs. To more reliably know what's getting sent, you can enable Npgsql's logging at the lower ADO.NET level, or use Wireshark to look at the packets. |
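To check the effect of the parameter type outside of EF, the parameter's `NpgsqlDbType` can be set explicitly at the ADO.NET level and the resulting plans compared. The following is a minimal sketch, not a confirmed repro: it assumes the `crm_calllogs` table from this thread exists, that `ud_stt_rec` is a fixed-length character domain, and the sample values are hypothetical.

```csharp
using Npgsql;
using NpgsqlTypes;

await using var dataSource = NpgsqlDataSource.Create("Host=localhost;Username=test;Password=test");
await using var conn = await dataSource.OpenConnectionAsync();

// Hypothetical key values; replace with values that exist in your table.
string[] values = ["A000000001", "A000000002"];

// Send the same array once as text[] and once as character[] (bpchar[]),
// and print the plan PostgreSQL picks for each.
foreach (var elementType in new[] { NpgsqlDbType.Text, NpgsqlDbType.Char })
{
    await using var command = new NpgsqlCommand(
        "EXPLAIN SELECT * FROM crm_calllogs WHERE stt_rec_doitac = ANY($1)", conn);
    command.Parameters.Add(new NpgsqlParameter
    {
        Value = values,
        NpgsqlDbType = NpgsqlDbType.Array | elementType
    });

    Console.WriteLine($"--- parameter sent as {elementType} array ---");
    await using var reader = await command.ExecuteReaderAsync();
    while (await reader.ReadAsync())
        Console.WriteLine(reader.GetString(0)); // one plan line per row
}
```

If the two plans differ (for example, a sequential scan for one and an index scan for the other), that would point at the parameter's array type rather than at constants vs. parameters.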
@roji I don't think so. It's still slow even after I configured the column type. I also tried [Column(TypeName = "ud_stt_rec")], and it's still the same. |
@roji I found the root cause. It's not related to the data type. I have to configure HasMaxLength(). I think that if we don't set it, it will use text as the default for string. |
It sounds like you're asking how to set the default max length for all string properties - see these docs. Am going to close this issue as the problem has been found and fixed, but feel free to continue posting if you have further questions. |
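For readers landing here, this is a minimal sketch of the two configuration styles mentioned above: `HasMaxLength()` on individual properties, and a model-wide default for all string properties via `ConfigureConventions`. The `CallLogContext` and `CallLog` types, the property names, and the lengths 13 and 50 are assumptions for illustration only; the actual length behind `ud_stt_rec` is not shown in this thread.

```csharp
using Microsoft.EntityFrameworkCore;

public class CallLogContext : DbContext
{
    public DbSet<CallLog> CallLogs => Set<CallLog>();

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder.UseNpgsql("Host=localhost;Username=test;Password=test");

    // Option 1: a default max length for every string property in the model.
    // The value 50 is an assumption; pick one that matches your schema.
    protected override void ConfigureConventions(ModelConfigurationBuilder configurationBuilder)
        => configurationBuilder.Properties<string>().HaveMaxLength(50);

    // Option 2: per-property configuration, which overrides the convention above.
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<CallLog>(e =>
        {
            e.ToTable("crm_calllogs");
            e.HasKey(c => c.SttRec);
            // 13 is a hypothetical length for the ud_stt_rec domain.
            e.Property(c => c.SttRec).HasColumnName("stt_rec").HasMaxLength(13);
            e.Property(c => c.SttRecDoiTac).HasColumnName("stt_rec_doitac").HasMaxLength(13);
        });
    }
}

public class CallLog
{
    public string SttRec { get; set; } = null!;
    public string SttRecDoiTac { get; set; } = null!;
}
```

Only two columns of the table are mapped here to keep the sketch short; a real model would map the remaining columns the same way.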
See details below. I have a table with millions of records;
Here is the generated SQL. It's using = ANY (ARRAY['string']).
It took 0.3s to execute. But when I change it to this, it uses = ANY ('{ string }').
It took only 0.02s to execute.
I'm using PostgreSQL 12 on Windows Server 2019.