# ask-community
b
I am having trouble setting task dependencies where there is not a direct input from one task to the next.
Here is my flow (slightly obfuscated):
with Flow("My Flow",) as flow:
    parameter = get_parameter()
    objects = get_objects()
    transformed = transform_objects(objects)
    write_data_to_s3(parameter, transformed)
    copy_data_to_snowflake(parameter, upstream_tasks=[write_data_to_s3])
What I want to happen is to `write_data_to_s3`, and then `copy_data_to_snowflake` after `write_data_to_s3` is finished. However, this code creates a second copy of `write_data_to_s3` in my flow schematic, and does not create the relationship I am looking for. Any advice?
k
Yeah. Store the result of calling `write_data_to_s3` in a variable and then point to that.
with Flow(...) as flow:
    a = first_task()
    b = second_task()
    c = third_task(c_inputs, upstream_tasks=[a,b])
Or you can do:
with Flow(...) as flow:
    a = first_task()
    b = second_task()

    c = third_task()
    c.set_upstream(b)
    c.set_upstream(a)
b
Does it matter that `second_task` and `third_task` both accept the same `parameter` value?
k
I don’t think it should. You just get another line in your DAG schematic.
b
In your example, if `a` is an input into `third_task`, is it necessary to also set BOTH `a` and `b` as upstream tasks?
k
No, because the dependency from `a` will already be built, so I think you should only need `b`. This is a hard thing to reason about, though.
b
Yeah, I thought I was doing it right... I was getting an error that I was calling `write_data_to_s3` without providing required arguments, which felt completely wrong. Trying again and will report back. Thank you!
k
So with:
with Flow("ex") as flow:
    a = first_task()
    b = second_task()
    c = third_task(a, upstream_tasks=[b])
I got:
b
This is what I wanted!
Much appreciated @Kevin Kho
k
Of course!