05/01/2023, 9:34 PM
What happens in Prefect when you restart a failed run? E.g. let's say I had retry=3 but all retries failed because the database was down, and the last task in the flow needed to ingest data into the database. One hour later the database comes back up and I manually restart the failed run. Does the restarted run know it's yet another incarnation of a previously failed run, or is it stateless and thinks it's a different run?

I guess my question is whether Prefect re-runs are stateful or stateless, and what the retry semantics are. Also, what are the best practices around idempotency? I want all-or-nothing semantics with the final results written to the database, no partial leakage. So the question also becomes whether re-runs that wrote no data are considered successful.

Update: I saw this post from @Anna M Geller. I think there is a hidden "Data Engineering 101" assumption that:
1. Tasks and DAGs as a whole should be idempotent
2. Use timestamps as a means of achieving idempotent tasks, e.g. max(LAST_UPDATED); this is the flag that indicates whether the task already succeeded, so a re-run has no side effects
3. Implement atomic all-or-nothing semantics within each task
   a. For example - no partial writes to database tables
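
To make those three points concrete for myself, here is a minimal sketch of what I think an idempotent, all-or-nothing ingest step could look like. It is deliberately not Prefect-specific: it uses Python's built-in sqlite3 as a stand-in for the real database, and the `events` table, its columns, and the `ingest` function are hypothetical names I made up for illustration. The watermark is max(last_updated), and the write happens in a single transaction, so a failed attempt leaves nothing behind and a re-run after success writes nothing.

```python
import sqlite3


def ingest(conn, rows):
    """Insert only rows newer than the stored watermark, all-or-nothing.

    Returns the number of rows written (0 means the re-run was a no-op).
    """
    # Watermark: latest timestamp already in the table (None if it is empty).
    (watermark,) = conn.execute("SELECT max(last_updated) FROM events").fetchone()
    new_rows = [r for r in rows if watermark is None or r[2] > watermark]
    # `with conn` commits on success and rolls back if any insert raises.
    with conn:
        conn.executemany(
            "INSERT INTO events (id, payload, last_updated) VALUES (?, ?, ?)",
            new_rows,
        )
    return len(new_rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY, payload TEXT NOT NULL, last_updated TEXT NOT NULL)"
)

batch = [
    (1, "a", "2023-05-01T10:00:00"),
    (2, "b", "2023-05-01T11:00:00"),
]
# Simulated failure: the third row violates NOT NULL partway through the batch.
bad_batch = batch + [(3, None, "2023-05-01T12:00:00")]

try:
    ingest(conn, bad_batch)
except sqlite3.IntegrityError:
    pass  # the whole batch was rolled back, including rows 1 and 2
rows_after_failure = conn.execute("SELECT count(*) FROM events").fetchone()[0]

first_run = ingest(conn, batch)   # restart after the failure: writes both rows
second_run = ingest(conn, batch)  # accidental re-run: nothing is newer than the watermark
```

Under this design a re-run that writes zero rows would simply return 0 and finish without error, which I assume is what "successful" should mean here. One caveat of the strict `>` comparison: late-arriving rows that share the watermark timestamp would be skipped, so a real pipeline might need a unique key or an upsert instead.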