Having considered this further, one issue with my suggestion above is how to handle duplicate tags when multiple assets are materialised in a single run. A simpler approach could be to add an optional `partition_priority` argument to `BackfillPolicy`. This could look as follows:

    from dagster import asset, BackfillPolicy, PartitionPriority, DailyPartitionsDefinition

    @asset(
        partitions_def=DailyPartitionsDefinition(start_date="2024-01-01"),
        backfill_policy=BackfillPolicy(partition_priority=PartitionPriority.REVERSE),
    )
    def my_asset_with_backfill_priority():
        """
        Example asset with a 'REVERSE' partition priority.
        Recent partitions will be given the highest priority.
        """
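To make the intended semantics concrete, here is a minimal plain-Python sketch of the ordering that a 'REVERSE' priority would imply for daily partitions. Note that `PartitionPriority` and the `partition_priority` argument are part of the proposal rather than existing Dagster API, and `order_partitions_newest_first` is a hypothetical helper used only for illustration.

```python
from datetime import date


def order_partitions_newest_first(partition_keys: list[str]) -> list[str]:
    """Return daily partition keys (ISO dates) sorted newest first.

    Hypothetical helper: it only illustrates the ordering a 'REVERSE'
    partition priority would imply, not how Dagster schedules runs.
    """
    return sorted(partition_keys, key=date.fromisoformat, reverse=True)


keys = ["2024-01-01", "2024-01-03", "2024-01-02"]
ordered = order_partitions_newest_first(keys)
```

With this ordering, a backfill over the three keys would submit `2024-01-03` first and `2024-01-01` last.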
The Problem
We have daily partitioned assets that are materialized using automation policies. The consumers of these assets want upstream changes to be propagated to the asset within a few minutes for recent partitions. They care mostly about the freshness of the latest partition.
There are regular "update" events throughout the day that target the latest partition. Occasionally we receive bulk "restatement" events that target a large number of partitions, which triggers a backfill. When this occurs, the latest partition can get queued behind historic partitions and therefore breach its freshness SLA.
Proposed Solution
One idea is to leverage the existing run queue prioritization functionality. For this to work, we would need a mechanism to dynamically tag the individual partition runs at "backfill launch time". The runs would be tagged with the `dagster/priority` value based on the partition age.
A generalisation of this idea would be to allow the user to define a "tag generating function" for the asset that gets called during run submission. The API for this could look as follows:
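As a minimal sketch of the kind of function meant here (not the exact API from the proposal), a tag-generating function could map a partition key to a `dagster/priority` tag based on the partition's age. The function name `priority_tags`, its parameters, and the clamping rule are all assumptions made for illustration. The sketch relies only on the run queue convention that higher `dagster/priority` values are dequeued first.

```python
from datetime import date


def priority_tags(partition_key: str, today: date, max_priority: int = 10) -> dict[str, str]:
    """Hypothetical tag-generating function: newer partitions get higher priority.

    Assumes daily partition keys formatted as ISO dates (YYYY-MM-DD).
    """
    age_days = max((today - date.fromisoformat(partition_key)).days, 0)
    # Priority drops by one per day of age and is clamped at -max_priority,
    # so all sufficiently old partitions share the same floor value.
    priority = max(max_priority - age_days, -max_priority)
    # Tag values are strings, so the integer priority is stringified.
    return {"dagster/priority": str(priority)}


today = date(2024, 3, 10)
latest_tags = priority_tags("2024-03-10", today)
old_tags = priority_tags("2024-01-01", today)
```

Here the latest partition receives the maximum priority, while a partition from January falls to the floor value, so a restatement backfill would not push the latest partition to the back of the queue.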
Alternative approaches
We have come up with a workaround to this problem internally. We split the problem into "recent" partitions (<1 month old), which are handled through an automation policy, and historic partitions, which are handled with a scheduled backfill. The scheduled backfill has some intelligence, in that it only targets stale partitions. Nonetheless, this feels like a hack. We would prefer to be able to handle this entirely with Declarative Automation.
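The split described above can be sketched in plain Python. This is only an illustration under stated assumptions: `split_partitions` is a hypothetical helper, "1 month" is approximated as 30 days, and the set of stale partitions is assumed to be supplied by the caller rather than computed by Dagster.

```python
from datetime import date, timedelta


def split_partitions(
    partition_keys: list[str],
    stale_keys: set[str],
    today: date,
    recent_days: int = 30,
) -> tuple[list[str], list[str]]:
    """Split daily partition keys into recent ones and stale historic ones.

    Recent partitions would be left to the automation policy; only stale
    historic partitions would be targeted by the scheduled backfill.
    """
    cutoff = today - timedelta(days=recent_days)
    recent = [k for k in partition_keys if date.fromisoformat(k) > cutoff]
    historic_stale = [
        k
        for k in partition_keys
        if date.fromisoformat(k) <= cutoff and k in stale_keys
    ]
    return recent, historic_stale


today = date(2024, 3, 10)
recent, historic_stale = split_partitions(
    ["2024-03-09", "2024-02-01", "2024-01-15"],
    stale_keys={"2024-01-15"},
    today=today,
)
```

In this example only `2024-01-15` is handed to the scheduled backfill: `2024-02-01` is historic but not stale, and `2024-03-09` is recent.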