# best-practices
Hi, I'm able to set up a flow and orchestrate it with a deployment spec. Now I'd like to know what the best solution is for running a backfill set of runs. Could you give some pointers?
For backfilling I would trigger such flow runs locally by specifying the date periods using flow parameters:
```python
import pendulum
from prefect import task, flow, get_run_logger

@task
def extract_and_load_data(start_date: str, end_date: str):
    # your extract logic based on those dates
    logger = get_run_logger()
    logger.info(f"Backfilling data for the interval {start_date} - {end_date}")

@flow
def ingest_and_backfill(start_date: str, end_date: str):
    extract_and_load_data(start_date=start_date, end_date=end_date)

if __name__ == "__main__":
    # example: backfill January 2022 one day at a time
    for day in pendulum.period(pendulum.date(2022, 1, 1), pendulum.date(2022, 1, 31)).range("days"):
        ingest_and_backfill(start_date=str(day), end_date=str(day.add(days=1)))
```
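If it helps, the loop over backfill windows can also be factored out into a small helper. A minimal sketch using only the standard library; the daily granularity, the `daily_windows` name, and the ISO date strings are assumptions for illustration:

```python
from datetime import date, timedelta

def daily_windows(start: date, end: date):
    """Yield (start_date, end_date) ISO-string pairs, one per day in [start, end)."""
    day = start
    while day < end:
        yield day.isoformat(), (day + timedelta(days=1)).isoformat()
        day += timedelta(days=1)

# each pair can be passed straight into the flow as parameters,
# e.g. ingest_and_backfill(start_date=s, end_date=e)
windows = list(daily_windows(date(2022, 1, 1), date(2022, 1, 4)))
```

Keeping the window generation separate from the flow makes it easy to change the granularity (hourly, weekly) without touching the flow code.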
Is that considered best practice? As I understand it, other solutions offer this functionality via a GUI.
Creating a parametrized run through the UI will be supported; it's on the roadmap.
But backfilling through a local script seems easier, especially given that even local runs are tracked in the UI in Prefect 2.0.
The reason is that, in our case, we don't like manual activities on our data pipeline boxes, and most processes need to be abstracted away to avoid them. But I also prefer the manual solution here, since it'll be a one-off anyway.
Any backfill is a manually triggered run; what am I missing? Not sure I understand the difference. In 1.0, the difference was that a run triggered from a local script wouldn't be reflected in the UI and wouldn't be auditable, but this is no longer the case in 2.0, where the API is omnipresent even for local runs: all runs are auditable and appear in the run history in the UI, regardless of which process triggered them.
Triggering flow runs locally (or on the server, if feasible) with date ranges is how my team does it, but we're on pre-1.0, so yes, the lack of a record of the run in the UI isn't great.
The difference is that you need to be able to run scripts manually on a box (and those could be unaudited scripts, which is a concern), versus having the run triggered by a system with settings applied.
I'm curious, what would you consider an unaudited script? I believe that if someone is able to talk to your API, it had better be that those people and processes are authorized to do so. For example, only users and processes with a valid API_KEY can talk to the Cloud 2.0 API, so the default state of Prefect 2.0 is that only audited people and processes can run your backfill scripts.
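For reference, a process typically authenticates by exporting Prefect 2.0's standard settings before running the script; the account/workspace IDs and the key below are placeholders:

```shell
# a process must present a valid API key before it can create
# flow runs against the Cloud 2.0 API
export PREFECT_API_URL="https://api.prefect.cloud/api/accounts/<account-id>/workspaces/<workspace-id>"
export PREFECT_API_KEY="<your-api-key>"
```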
For most of our prod scripts we require some form of review, and then it's fine. It's not really about the people (who may or may not have rights), but more about having some form of collaboration that results in better scripts. Generally, I believe most internal infra is open to most people.
Exactly, I think this is more of a people problem than a technology problem. Well put.