Any recommendations on the best way to handle statefulness between flow runs and tasks?
I’m looking to make a flow that executes many steps fault-tolerant in case a particular task and/or step fails.
Tools such as Temporal handle this well, but require a lot of heavy lifting to implement. I’m wondering if there is a best-practice approach available within Prefect.
Use Case:
I have a pipeline containing many tasks that update a database record as it progresses through the flow.
A failure during this workflow would leave a record in a potentially broken state where a status has been updated, but has not progressed to a subsequent step.
A re-run of this flow without proper statefulness could potentially render this record as broken forever without manual intervention to reset the record.
Dominic Tarro
03/29/2023, 4:05 PM
1. Is not updating the record each step of the way an option?
2. Use a staging table with a column that is only set when all tasks for it succeed. If the record was successfully handled, this will be marked, and you can update your destination table.
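A minimal sketch of option 2, assuming SQLite and hypothetical names throughout (`staging`, `destination`, the `completed` column, `make_db`, and `process_record` are illustrative, not from Prefect or from the original pipeline). Each step writes only to the staging row, and the destination table is touched once, after every step has succeeded:

```python
import sqlite3


def make_db():
    """Create an in-memory database with hypothetical staging/destination tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staging ("
        "id INTEGER PRIMARY KEY, payload TEXT, completed INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("CREATE TABLE destination (id INTEGER PRIMARY KEY, payload TEXT)")
    return conn


def process_record(conn, record_id, steps):
    """Run every step against the staging row; promote only if all succeed."""
    # Reset the staging row so a re-run always starts from a clean state.
    conn.execute(
        "INSERT OR REPLACE INTO staging (id, payload, completed) VALUES (?, '', 0)",
        (record_id,),
    )
    conn.commit()

    payload = ""
    try:
        for step in steps:
            payload = step(payload)
            # Intermediate progress lives only in staging.
            conn.execute(
                "UPDATE staging SET payload = ? WHERE id = ?", (payload, record_id)
            )
            conn.commit()
    except Exception:
        # A failed step leaves completed = 0 and the destination untouched.
        return False

    # Mark the row complete and promote it in a single transaction.
    with conn:
        conn.execute("UPDATE staging SET completed = 1 WHERE id = ?", (record_id,))
        conn.execute(
            "INSERT OR REPLACE INTO destination (id, payload) "
            "SELECT id, payload FROM staging WHERE id = ? AND completed = 1",
            (record_id,),
        )
    return True
```

With this shape, a re-run after a failure simply resets the staging row and tries again, so the destination record never ends up in a half-updated status.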
datamongus
03/29/2023, 4:07 PM
Very true, this could be an option and will likely be the one I select, but I was curious whether there is state management built into Prefect itself for lower-level tasks.
datamongus
03/29/2023, 4:07 PM
As in, something I can depend on outside of the data warehouse.