# ask-community
b
Hey guys, I know there is a Task retry logic that you can add ( on a failure ) but is there a Flow level retry?
k
We don’t have a flow-level retry. In Prefect it would mean starting a new flow run, or using the GraphQL API to set the tasks’ states back to `Scheduled`. Maybe coupled with a counter in the KV store you could pull it off. The simple hack is wrapping a retry around `StartFlowRun` or `create_flow_run`, but that would create separate flow runs. Why do you want flow-level retries? When you say that, do you mean re-running even tasks that were successful?
b
nah, I am having failures of flows with race conditions when using the LocalDaskExecutor. The race conditions are irregular (about 2% of the time) and simply fixed with a re-run. The speed benefits I get from the executor are worth the small errors, but having a retry would be enormously helpful.
and different tasks fail on different runs, so a task-level retry wouldn't really suit
unless I added it to every task, which feels wrong
k
Gotcha, what causes the race condition? Is it related to IO? Yeah adding to every task feels like the least invasive solution here.
Are you using processes or threads, btw?
b
I'm not sure what you mean. I just get strange errors, and from googling them it appears to be something to do with boto3 running on multiple threads.
a
@Ben Muller are you using boto3.resource in your code? According to the boto3 docs, resources are not thread-safe, so it could be that your problem can be solved by switching to boto3.client. You can find more about it here and here
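When sharing boto3 across threads is in doubt, the usual safe pattern is one client per thread via `threading.local`. A minimal sketch of that pattern follows; `make_client()` is a stand-in for `boto3.client("s3")` (a plain object, so the example runs without boto3 installed).

```python
import threading

_local = threading.local()

def make_client():
    # Stand-in for boto3.client("s3"). boto3 clients are generally
    # thread-safe, but boto3 *resources* are not, so creating one
    # per thread is the conservative pattern.
    return object()

def get_client():
    """Return a client unique to the calling thread, created lazily."""
    if not hasattr(_local, "client"):
        _local.client = make_client()
    return _local.client

# Demonstrate that each thread gets its own instance.
seen = {}
def worker(i):
    seen[i] = get_client()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```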
b
Thanks for the suggestion @Anna Geller (old account) but I'm using the boto3 client!
k
The `LocalDaskExecutor` takes a scheduler argument that chooses between processes and threads. Was wondering if you were using processes already, cuz that might help?
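For reference, a configuration sketch of that switch, assuming Prefect 1.x (where `LocalDaskExecutor` lives in `prefect.executors`); the flow name and worker count here are placeholders:

```python
from prefect import Flow
from prefect.executors import LocalDaskExecutor

with Flow("my-flow") as flow:
    ...  # tasks go here

# "threads" is the default scheduler; "processes" avoids shared-state
# races between tasks at the cost of pickling/IPC overhead.
flow.executor = LocalDaskExecutor(scheduler="processes", num_workers=4)
```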
b
Ah, I'm using threads. I will change it and see if that helps