
Jehan Abduljabbar

10/13/2022, 12:41 PM
Hello. I have a flow that uses DaskTaskRunner. Some tasks depend on each other through parameters, and others have dependencies specified with "wait_for", so the tasks in the flow essentially fall into 3 groups, and the tasks within each group can run in parallel. DaskTaskRunner recognizes that, right? I want to make sure that having dependencies in the flow doesn't reduce parallelism for the tasks that could still run in parallel. I would rather not split them into 3 subflows because the logic works better as one flow, but I'm wondering if I have to.

Khuyen Tran

10/13/2022, 2:03 PM
Can you provide a minimal example of your use case?

Jehan Abduljabbar

10/13/2022, 2:17 PM
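from prefect import flow
from prefect_dask import DaskTaskRunner

# function1, function2 and task1..task14 are assumed to be @task-decorated elsewhere.
@flow(task_runner=DaskTaskRunner())  # the task runner is set on the decorator, not as a flow parameter
def my_flow():
    # .result() blocks, so these two run one after the other
    val1 = function1.submit().result()
    val2 = function2.submit().result()
    task1data = task1.submit(val1)
    task2data = task2.submit(val1)
    task3data = task3.submit(val1)
    task4data = task4.submit(val1)
    task5data = task5.submit(val1)
    task6data = task6.submit(val1)
    task7data = task7.submit(val1)
    # keep the futures so they can be passed to wait_for below
    task8future = task8.submit(val2, task1data)
    task9future = task9.submit(val2, task2data)
    task10future = task10.submit(val2, task3data)
    # wait_for takes futures returned by .submit(), not the task objects themselves
    task11.submit(val2, task4data, wait_for=[task8future])
    task12.submit(val2, task5data, wait_for=[task9future])
    task13.submit(val2, task6data, wait_for=[task8future, task9future])
    task14.submit(val2, task7data, wait_for=[task8future, task10future])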
So what I am anticipating is that the 2 functions run sequentially, then the first 7 tasks run in parallel, then the 3 tasks after them run in parallel, and then the last 4 tasks run in parallel.

Khuyen Tran

10/13/2022, 3:26 PM
It looks like it will behave as you describe, but you can double-check the logs to make sure it works as expected.
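As a rough cross-check to compare against the logs, the dependencies in the example can be grouped into levels using only the Python standard library. This is a sketch under assumptions, not Prefect's or Dask's scheduler: task names are modeled as plain strings, the `deps` mapping is transcribed by hand from the example (data arguments plus wait_for entries), and `dependency_waves` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Upstream dependencies transcribed from the example flow (an assumption of
# this sketch; nothing here calls Prefect or Dask).
deps = {"function1": set(), "function2": {"function1"}}  # .result() forces this order
for i in range(1, 8):
    # task1..task7 take val1 and are only submitted after function2 has finished
    deps[f"task{i}"] = {"function1", "function2"}
deps["task8"] = {"function2", "task1"}
deps["task9"] = {"function2", "task2"}
deps["task10"] = {"function2", "task3"}
deps["task11"] = {"function2", "task4", "task8"}
deps["task12"] = {"function2", "task5", "task9"}
deps["task13"] = {"function2", "task6", "task8", "task9"}
deps["task14"] = {"function2", "task7", "task8", "task10"}


def dependency_waves(graph):
    """Group nodes into levels; nodes in one level do not depend on each other."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorter.get_ready()  # everything whose upstream nodes are all done
        # sort by length first so that task10 comes after task9
        waves.append(sorted(ready, key=lambda name: (len(name), name)))
        sorter.done(*ready)
    return waves


waves = dependency_waves(deps)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

The levels show the earliest point at which each task could start. In an actual run, a task can begin as soon as its own upstream tasks finish, so task8 may start before task7 has completed; the start times in the logs do not have to line up in strict stages.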

Jehan Abduljabbar

10/13/2022, 3:40 PM
Sounds good, thanks!