Hey team - I have a Prefect 1 flow in which one of the tasks is being called twice even though I only have one invocation of it. I will post a mock snippet in the thread.
Mitch
05/31/2023, 3:51 PM
df_inc = inception(accounts)  # inception is a task
UploadSQLTask(task_args={'trigger': all_successful},
              df=df_inc,
              unlinkFile=False,
              upstream_tasks=[inception]
              )
Is there any reason this is running the inception task twice when I only want to run once?
Zanie
05/31/2023, 3:56 PM
Looks like you want upstream_tasks to be df_inc
Mitch
05/31/2023, 3:57 PM
df_inc is just the return value, not the task. I will try it out!
Zanie
05/31/2023, 4:06 PM
The return value is representative of the task while you are constructing a flow
Zanie
05/31/2023, 4:07 PM
You don’t actually need upstream_tasks at all here since you are passing df_inc as data
Mitch
05/31/2023, 4:08 PM
Got it. I'll try it out and see if it works as expected
Mitch
05/31/2023, 4:08 PM
Thanks for the help @Zanie
Deceivious
05/31/2023, 4:46 PM
But I don't think the upstream_tasks keyword param has anything to do with the duplication in this case?
Zanie
05/31/2023, 4:47 PM
Does it not? It seems likely that passing a Task object there is adding a second instance of that task to the graph.
Zanie
05/31/2023, 4:48 PM
i.e. we construct a graph of tasks by looking at the declaration of your flow. If you pass inception via df_inc as well as inception via upstream_tasks, we’re going to have two copies of that task in the graph.
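One rough way to picture the explanation above is a toy model. This is not Prefect's actual implementation: ToyFlow, ToyTask and count are hypothetical names made up for illustration. The sketch assumes that calling a task while building a flow registers a copy of it and returns that copy, so the bare task object and the call result are two different nodes.

```python
import copy


class ToyFlow:
    """Hypothetical stand-in for a flow: records task nodes and edges."""

    def __init__(self):
        self.tasks = []  # nodes, deduplicated by object identity
        self.edges = []  # (upstream, downstream) pairs

    def add_task(self, task):
        # Add a node only if this exact object is not already in the graph.
        if not any(t is task for t in self.tasks):
            self.tasks.append(task)

    def add_edge(self, upstream, downstream):
        self.add_task(upstream)
        self.add_task(downstream)
        self.edges.append((upstream, downstream))


class ToyTask:
    """Hypothetical stand-in for a task (assumption: calling it registers a copy)."""

    def __init__(self, name):
        self.name = name

    def __call__(self, flow, *args, upstream_tasks=()):
        node = copy.copy(self)  # the call result is a new node, not `self`
        flow.add_task(node)
        # Any task passed as data or as an upstream becomes a dependency.
        for up in list(args) + list(upstream_tasks):
            if isinstance(up, ToyTask):
                flow.add_edge(up, node)
        return node


def count(flow, name):
    """Number of nodes in the flow with the given task name."""
    return sum(1 for t in flow.tasks if t.name == name)


inception = ToyTask("inception")
upload = ToyTask("upload")

# Wiring from the original snippet: call result as data AND bare task as upstream.
buggy = ToyFlow()
df_inc = inception(buggy, ["acct-1"])
upload(buggy, df_inc, upstream_tasks=[inception])
print(count(buggy, "inception"))  # 2

# Suggested wiring: passing df_inc as data already creates the dependency.
fixed = ToyFlow()
df_inc = inception(fixed, ["acct-1"])
upload(fixed, df_inc)
print(count(fixed, "inception"))  # 1
```

In the toy model the second inception node comes from the bare inception object, which was never called with accounts; dropping upstream_tasks (or passing df_inc there instead) leaves a single node.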