Clean up PrefectDBInterface, use the models consistently #16392
Conversation
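For context on the title: PrefectDBInterface is the server's database abstraction, and "using the models consistently" plausibly means resolving ORM model classes through that interface instead of importing dialect-specific classes directly. A hedged sketch of that pattern follows; the import path and attribute names are assumptions for illustration, not taken from this PR.

# Hedged sketch: resolve ORM models through the database interface.
# Import path and attribute names are assumptions, not from this PR.
import sqlalchemy as sa

from prefect.server.database import provide_database_interface

db = provide_database_interface()


async def count_flow_runs(session) -> int:
    # db.FlowRun is the interface-provided ORM model; going through the
    # interface keeps callers independent of the dialect-specific modules.
    result = await session.execute(sa.select(sa.func.count(db.FlowRun.id)))
    return result.scalar_one()

Routing model access through one interface matters here because the server supports both SQLite and PostgreSQL backends, as the discussion below illustrates.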
CodSpeed Performance Report: Merging #16392 will not alter performance.
Type completeness increased by 1.83%.
This looks good to me, but I want to make sure @zzstoatzz also gets a chance to look to ensure I'm not missing anything.
i haven't looked deeply enough yet to know which of the changes caused this, but i am seeing 500s from the graph v2 endpoint in the UI (flow run timeline) and the flow run timeline does not render for flow runs
File "/Users/nate/github.com/prefecthq/prefect/src/prefect/server/utilities/database.py", line 128, in process_bind_param
raise ValueError("Timestamps must have a timezone.")
sqlalchemy.exc.StatementError: (builtins.ValueError) Timestamps must have a timezone.
[SQL: WITH edges AS
(SELECT CASE WHEN (flow_run.id IS NOT NULL) THEN :param_1 ELSE :param_2 END AS kind, coalesce(flow_run.id, task_run.id) AS id, coalesce(flow.name || :name_1 || flow_run.name, task_run.name) AS label, coalesce(flow_run.state_type, task_run.state_type) AS state_type, coalesce(flow_run.start_time, flow_run.expected_start_time, task_run.start_time, task_run.expected_start_time) AS start_time, coalesce(flow_run.end_time, task_run.end_time, CASE WHEN (task_run.state_type = :state_type_1) THEN task_run.expected_start_time ELSE NULL END) AS end_time, JSON_EXTRACT(argument.value, :value_1) AS parent, input."key" = :key_1 AS has_encapsulating_task
FROM task_run LEFT OUTER JOIN json_each(task_run.task_inputs) AS input ON 1 = 1 LEFT OUTER JOIN json_each(input.value) AS argument ON 1 = 1 LEFT OUTER JOIN flow_run ON flow_run.parent_task_run_id = task_run.id LEFT OUTER JOIN flow ON flow.id = flow_run.flow_id
WHERE task_run.flow_run_id = :flow_run_id AND task_run.state_type != :state_type_2 AND coalesce(flow_run.start_time, flow_run.expected_start_time, task_run.start_time, task_run.expected_start_time) IS NOT NULL ORDER BY coalesce(flow_run.id, task_run.id)),
with_parents AS
(SELECT children.id AS id, json_group_array(parents.id) AS parent_ids
FROM edges AS children JOIN edges AS parents ON parents.id = children.parent
WHERE children.has_encapsulating_task IS NOT 1 GROUP BY children.id),
with_children AS
(SELECT parents.id AS id, json_group_array(children.id) AS child_ids
FROM edges AS parents JOIN edges AS children ON children.parent = parents.id
WHERE children.has_encapsulating_task IS NOT 1 GROUP BY parents.id),
with_encapsulating AS
(SELECT children.id AS id, json_group_array(parents.id) AS encapsulating_ids
FROM edges AS children JOIN edges AS parents ON parents.id = children.parent
WHERE children.has_encapsulating_task IS 1 GROUP BY children.id),
nodes AS
(SELECT DISTINCT edges.kind AS kind, edges.id AS id, edges.label AS label, edges.state_type AS state_type, edges.start_time AS start_time, edges.end_time AS end_time, with_parents.parent_ids AS parent_ids, with_children.child_ids AS child_ids, with_encapsulating.encapsulating_ids AS encapsulating_ids
FROM edges LEFT OUTER JOIN with_parents ON with_parents.id = edges.id LEFT OUTER JOIN with_children ON with_children.id = edges.id LEFT OUTER JOIN with_encapsulating ON with_encapsulating.id = edges.id)
SELECT nodes.kind, nodes.id, nodes.label, nodes.state_type, nodes.start_time, nodes.end_time, nodes.parent_ids AS parent_ids, nodes.child_ids AS child_ids, nodes.encapsulating_ids AS encapsulating_ids
FROM nodes
WHERE nodes.end_time IS NULL OR nodes.end_time >= :since ORDER BY nodes.start_time, nodes.end_time
LIMIT :max_nodes OFFSET :param_3]
[parameters: [{'flow_run_id': UUID('a51a080f-c046-464f-bc7b-ccb0289ea7b8'), 'since': datetime.datetime(1, 1, 1, 0, 0), 'max_nodes': 10001}]]
at a glance, looks like some timestamp issue
File "/Users/nate/github.com/prefecthq/prefect/src/prefect/server/utilities/database.py", line 128, in process_bind_param
raise ValueError("Timestamps must have a timezone.")
sqlalchemy.exc.StatementError: (builtins.ValueError) Timestamps must have a timezone.
i do not observe this on main
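For context on the mechanism behind the error above: a SQLAlchemy TypeDecorator can validate values in process_bind_param, so a naive datetime anywhere in the query's bound parameters fails at query time. A minimal sketch of such a validating type follows; it is a simplified illustration, not Prefect's actual Timestamp implementation.

# Minimal sketch of a bind-time validating type; simplified illustration,
# not Prefect's actual Timestamp implementation.
import sqlalchemy as sa
from sqlalchemy.types import TypeDecorator


class Timestamp(TypeDecorator):
    """Binds only timezone-aware datetimes; naive values raise at query time."""

    impl = sa.TIMESTAMP(timezone=True)
    cache_ok = True

    def process_bind_param(self, value, dialect):
        if value is None:
            return None
        if value.tzinfo is None:
            # The line the traceback above points at.
            raise ValueError("Timestamps must have a timezone.")
        return value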
I'm having trouble reproducing this; I tried running a flow with my local server and looked at the Runs section for the flow, and these all work. This was using PostgreSQL, however, and your included SQL query suggests this was with SQLite. Still, both codepaths use the same …
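One detail worth noting from the bound parameters quoted earlier: `since` is the naive `datetime.datetime(1, 1, 1, 0, 0)`, i.e. `datetime.min`, which is exactly the kind of value the validating type rejects. A hypothetical illustration of the difference follows; the aware sentinel shown is an assumed fix, not confirmed in this thread.

# Hypothetical illustration of the failing bind parameter. The naive
# datetime.min sentinel fails validation; an aware sentinel would pass.
import datetime

naive_sentinel = datetime.datetime.min  # datetime(1, 1, 1, 0, 0), tzinfo=None
aware_sentinel = datetime.datetime.min.replace(tzinfo=datetime.timezone.utc)

assert naive_sentinel.tzinfo is None      # rejected: "Timestamps must have a timezone."
assert aware_sentinel.tzinfo is not None  # accepted by the validating type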
thanks for the update @mjpieters!
this makes sense to me. looks like we have a couple remaining dt serialization issues, but otherwise this is looking good to me - thanks so much for all the improvements!
@zzstoatzz You didn't mark the PR as approved (it still says that there are required changes), but you didn't make any comments about what should be changed. Was that an oversight somewhere or is there something I still need to fix?
The 3 test failures here are due to #16440.
hi @mjpieters - apologies for the slight delay in turnaround
You didn't mark the PR as approved (it still says that there are required changes), but you didn't make any comments about what should be changed.
at the time of commenting the following:
looks like we have a couple remaining dt serialization issues
I was seeing test failures related to datetime serialization that now look to be resolved, so I intentionally did not approve at that time.
your current failures should be unrelated and resolved by #16433
Related: #16292