Add robust dry run capability for backfill #44395
Labels
area:API
Airflow's REST/HTTP API
area:backfill
Specifically for backfill related
kind:feature
Feature Requests
kind:meta
High-level information important to the community
Body
Child of parent issue #43970
As a user, I want to be able to dry run the backfill creation process from the UI. E.g. I click "create backfill" and give it a range; then, before I click "submit", I want the UI to show me the runs that will be created.
In order to do this, we'll have to refactor the backfill creation process a bit. Right now, we just submit a range, and the backfill endpoint will just create the backfill object and all of the runs.
One of the problems with implementing dry run is this: suppose we return "these runs will be created; proceed?". What if, before the user confirms, the scheduler schedules a run in the range, or a user clears or deletes one? Then we would not end up doing exactly what we said we were going to do.
So what we need is for the API to be able to return some representation of the entire backfill (the object and its runs), which the user could then submit back to another endpoint that just receives this payload and attempts to create it. In this second endpoint, which is essentially "take the payload and create", we would first lock the dag and then attempt to insert all the rows. If we find a conflict, we should abandon the whole attempt and tell the user: sorry, something changed, we got a conflict, please try again. The 409 Conflict API response would seem to be appropriate here.
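To make the two-step idea concrete, here is a rough sketch, not Airflow code. It uses an in-memory SQLite table as a stand-in for the dag run table, and every name in it (`plan_backfill`, `create_from_plan`, `BackfillConflict`, the `dag_run` schema, daily runs) is hypothetical. It also assumes the dry run skips dates that already have a run, which this issue does not specify. The dag lock is not modeled; a unique constraint plus a single transaction stands in for "insert everything or abandon the whole attempt".

```python
import sqlite3
from datetime import date, timedelta


class BackfillConflict(Exception):
    """Hypothetical error that an API layer could map to HTTP 409 Conflict."""


def make_db():
    # Stand-in for the real metadata database (assumed schema).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag_run (dag_id TEXT, logical_date TEXT, "
        "UNIQUE (dag_id, logical_date))"
    )
    return conn


def plan_backfill(conn, dag_id, start, end):
    """Dry run: list the daily runs that would be created; writes nothing."""
    existing = {
        row[0]
        for row in conn.execute(
            "SELECT logical_date FROM dag_run WHERE dag_id = ?", (dag_id,)
        )
    }
    days = (end - start).days + 1
    dates = [(start + timedelta(days=i)).isoformat() for i in range(days)]
    return {"dag_id": dag_id, "runs": [d for d in dates if d not in existing]}


def create_from_plan(conn, plan):
    """Insert every planned run in one transaction, or none at all."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.executemany(
                "INSERT INTO dag_run (dag_id, logical_date) VALUES (?, ?)",
                [(plan["dag_id"], d) for d in plan["runs"]],
            )
    except sqlite3.IntegrityError as exc:
        # A planned run appeared after the dry run, so nothing is created.
        raise BackfillConflict("runs changed since the dry run; retry") from exc
```

In this sketch the client would call `plan_backfill`, show `plan["runs"]` to the user, and on "submit" send the same payload to `create_from_plan`. If a run in the range was created in between, the whole insert is rolled back and the caller gets `BackfillConflict`, which is where the 409 response would come from.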
cc @phanikumv @jedcunningham @bbovenzi @pierrejeambrun