David Kuda
01/27/2021, 9:46 PM
1) create table a, 2) create table b, and 3) join data from tables a and b. We want to use the Dask executor because there are many more tasks. What is the recommended way to have task 1 and task 2 run simultaneously, but have task 3 wait for tasks 1 and 2? The Dask executor would build a DAG in which tasks 1, 2, and 3 all run together, but that will fail, since task 3 depends on tasks 1 and 2. The Python code, however, expresses no such dependency.
I have a few ideas, and I am excited to hear your opinion on this question. I hope I was able to express myself clearly. Best regards from Berlin, David.
josh
01/27/2021, 9:49 PM
In [1]: with Flow("my_flow") as flow:
   ...:     t1 = my_task1()
   ...:     t2 = my_task2()
   ...:     t3 = my_task3()
   ...:
   ...:     t3.set_upstream(t1)
   ...:     t3.set_upstream(t2)
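
For illustration, the ordering that the two `set_upstream` calls express (tasks 1 and 2 in parallel, task 3 only after both finish, with no data passed between them) can be sketched with the standard library alone. This is not Prefect code and not how the Dask executor is implemented; the function names `create_table_a`, `create_table_b`, `join_tables`, and `run_pipeline` are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def create_table_a(log):
    # Hypothetical stand-in for task 1; only records that it ran.
    log.append("a")


def create_table_b(log):
    # Hypothetical stand-in for task 2; only records that it ran.
    log.append("b")


def join_tables(log):
    # Hypothetical stand-in for task 3; needs both tables to exist first.
    log.append("join")


def run_pipeline():
    log = []  # completion order; list.append is safe across threads in CPython
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Tasks 1 and 2 do not depend on each other, so submit both at once.
        upstream = [
            pool.submit(create_table_a, log),
            pool.submit(create_table_b, log),
        ]
        # State-only dependency: block until both upstream tasks are done.
        wait(upstream)
        for future in upstream:
            future.result()  # re-raise any upstream exception before continuing
        # Only now is task 3 allowed to start.
        pool.submit(join_tables, log).result()
    return log
```

In Prefect 1.x the same dependency can also be declared when the task is called, e.g. `my_task3(upstream_tasks=[t1, t2])`, instead of calling `set_upstream` twice.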
David Kuda
01/27/2021, 9:51 PM
Zanie
01/27/2021, 9:56 PM
David Kuda
01/27/2021, 10:00 PM
merlin
01/28/2021, 1:36 AM