Hey all, we have some flows that we run on daily ...
# ask-community
r
Hey all, we have some flows that we run on a daily basis to create a report for the previous day. Therefore, we currently derive the previous date within the flow unless it is provided as an input parameter. So, if the flow runs on day D1, it will create the report for D1-1. The problem with this is that if we want to retrigger the flow on a later day (D2), the flow is not idempotent and doesn't rerun for the initial date D1-1, but for D2-1 instead. A workaround would be to define a "trigger deployment" that derives the date and then triggers the "execution deployment", such that the actual deployment run has the appropriate date parameter and can thus be retriggered on any day. In that case, I guess it would make sense that the "trigger deployment" always succeeds if the second deployment was triggered successfully. The "execution deployment" would only succeed if it successfully runs ... Is there a (more) elegant way to schedule a flow and parametrize it to use the previous day by default, without the need for the "trigger deployment"?
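For readers following along, the "trigger deployment" workaround described above could look roughly like this. It is a minimal sketch, not the poster's actual code: the deployment name `daily-report/execution`, the parameter name `for_date`, and both helper functions are hypothetical. It assumes Prefect's `run_deployment` with `timeout=0`, which returns once the run has been created rather than waiting for it to finish.

```python
from datetime import date, timedelta
from typing import Optional


def previous_day_parameters(today: Optional[date] = None) -> dict:
    """Build the parameters for the execution run: the day before `today`."""
    today = today or date.today()
    return {"for_date": (today - timedelta(days=1)).isoformat()}


def trigger_execution(deployment_name: str = "daily-report/execution") -> None:
    """Body of the hypothetical trigger flow: pin the date, then hand off."""
    # Imported lazily so the date helper above works without Prefect installed.
    from prefect.deployments import run_deployment

    # timeout=0 means "do not wait for the execution run to finish", so the
    # trigger is done as soon as the run has been submitted.
    run_deployment(
        name=deployment_name,
        parameters=previous_day_parameters(),
        timeout=0,
    )
```

Because the date is stored as a parameter of the execution run, retrying that run on any later day would reuse the same `for_date`.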
Maybe @Marvin can help?
m
Of course, I'm here to help. Could you please provide more details about what you need assistance with regarding Prefect?
r
@Marvin, how can I elegantly create a scheduled deployment that infers the date from the current datetime but also allows retriggering the flow without re-inferring the datetime? For example, if a flow that runs on Tuesday to evaluate Monday is retriggered on Wednesday, it should again run for Monday and not for Tuesday.
m
To create a scheduled deployment in Prefect that infers a date from the current datetime but allows retriggering with a fixed date, you can follow these steps:
1. Define the Flow Logic: Within your flow, infer the date only if it hasn't been set yet. Use a parameter to allow setting the date externally when retriggering.
2. Create a Deployment with a Schedule: Use Prefect's scheduling capabilities to define when your flow should run. You can use cron, interval, or RRule schedules depending on your needs.
3. Parameterize the Date: When defining the flow, include a parameter for the date. This way, when you retrigger the flow, you can specify the date you want it to run for.

Here's a conceptual example:
```python
from datetime import date, timedelta
from typing import Optional

from prefect import flow, task


@task
def evaluate_data(for_date: date):
    # Logic to evaluate data for the given date
    pass


@flow
def my_flow(for_date: Optional[date] = None):
    if for_date is None:
        # Infer the date as the previous day if not provided
        for_date = date.today() - timedelta(days=1)
    evaluate_data(for_date)


# Deployment and scheduling logic
# You can define this in a YAML file or using Prefect CLI
```
Links for further reading:
- Prefect Scheduling
- Custom Triggers in Prefect

This approach allows you to have a default behavior for date inference while retaining flexibility for retriggering with a fixed date.
r
That's how we did it, but it requires manually creating a new deployment run. One cannot simply "Retry" a failed deployment run, because that may change the date, as it would be wrongly inferred.
k
you could use caching to store the calculated date, and include the flow run id as part of the cache key computation. when you retry that flow run, it can fetch the cached date in part based on the fact that it was calculated in that particular flow run
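A rough sketch of the caching idea above, for illustration only. The helper names are hypothetical. It assumes the first argument Prefect passes to a `cache_key_fn` is the task run context, that `context.task_run.flow_run_id` identifies the parent flow run, and that results are persisted so a retry in a new process can read the cached value.

```python
from datetime import date, timedelta


def flow_run_scoped_key(context, parameters) -> str:
    """Cache key that stays the same across retries of one flow run.

    Assumes `context.task_run.flow_run_id` is the id of the parent flow run.
    """
    return f"report-date-{context.task_run.flow_run_id}"


def infer_report_date() -> str:
    """Yesterday's date as an ISO string; this is the value to cache."""
    return (date.today() - timedelta(days=1)).isoformat()


def make_cached_date_task():
    """Wrap the date inference in a Prefect task that uses the key above."""
    # Imported lazily so the key function can be inspected without Prefect.
    from prefect import task

    # persist_result=True is an assumption: the cached date has to be stored
    # somewhere that a retried flow run can read it back from.
    return task(cache_key_fn=flow_run_scoped_key, persist_result=True)(
        infer_report_date
    )
```

On the first attempt the task would compute and store the date; a retry of the same flow run produces the same key and so should get the stored date back instead of a newly inferred one.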
r
Good point! So, the same caching file would have different content based on whether the flow succeeds or fails, which could then be used to either rerun with the given date or not rerun at all when the flow ran successfully?
k
the cached data would be the return value from a task-decorated function, so you might need to check other properties of the flow run like its run count or state history for that level of control
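As a hedged illustration of checking the run count: the sketch below assumes `prefect.runtime.flow_run.run_count` reports how many times the current flow run has been started, so that a value above 1 indicates a retry. Both function names are hypothetical.

```python
def is_retry(run_count: int) -> bool:
    """True when the flow run has been started more than once."""
    return run_count > 1


def current_run_is_retry() -> bool:
    """Check the run count of the flow run this code is executing in."""
    # Imported lazily; only meaningful when called inside a Prefect flow run.
    from prefect.runtime import flow_run

    return is_retry(flow_run.run_count)
```

A flow could branch on `current_run_is_retry()` to decide whether to reuse a stored date or infer a new one.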
r
OK, sounds more complicated than the current workaround. Thanks for the support!