, but tasks have a limitation: they can’t call other tasks.
In a complex/nested job, for every function except the top-level flow I have to think about whether it is atomic and can be a task, or whether I will want to split it out further into multiple tasks, in which case it needs to be a flow.
But I realized I could just make everything a flow so I don’t have to worry about it. If I decide to split it out I can. If I don’t, no problem. Given the power and flexibility of flows, and that one arbitrary limitation of tasks, why use tasks at all?
So, what am I missing here? Are tasks obsolete? What’s their unique use case?
redsquare
10/18/2022, 4:14 PM
Retries?
Zanie
10/18/2022, 4:16 PM
This is indeed a very interesting change from v1 -> v2 and something we’ve talked about internally. The biggest difference right now is that tasks are submitted to a task runner, which allows them to easily run concurrently or on distributed workers. Tasks cannot call other tasks because the flow manages those workers: while a task is running on a distributed worker, we cannot manage submission and tracking of another, new task run from there. In contrast, running a flow from a flow is a blocking operation (unless using run_deployment).
Zanie
10/18/2022, 4:17 PM
@redsquare There are retries on both flows and tasks.
Anna Geller
10/18/2022, 4:23 PM
+1 to Michael and also another aspect to consider is observability/visibility: flow and task runs are displayed and tracked differently in the backend and UI
also things like caching work only on a task level atm
Khuyen Tran
10/18/2022, 4:31 PM
I created a video to explain when to use tasks and when to use flows.
Jason Thomas
10/18/2022, 4:50 PM
Thanks all, that helps. Thanks @Zanie for the details, and @Anna Geller for the note about caching.