David Michael Carter

02/28/2022, 5:03 PM
I have a task which I am attempting to parallelize. The function I am mapping returns a small dataframe. I need to append all of these small dataframes into a final result, which I will send to the next task. So far, I am attempting to append to a global DF within the flow steps (no luck):
finalDF = finalDF.append(foo.map(aList))

TypeError: cannot concatenate object of type '<class 'prefect.tasks.core.function.FunctionTask'>'; only Series and DataFrame objs are valid
Am I on the right track here, or do I need a completely new approach?
Kevin Kho

02/28/2022, 5:07 PM
Hi @David Michael Carter, I think you need to do foo.map() and then if you get a list of DataFrames, you’ll need another task for that logic. There is a distinction between build time and runtime. Tasks are evaluated during runtime. This append is evaluated during build time when foo.map hasn’t materialized yet. You likely need to encapsulate this logic in a task and call the append inside a task to defer the execution.
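A minimal sketch of that suggestion, assuming Prefect 1.x (the `Flow` context manager and `Task.map` API in use at the time of this thread) and pandas. The names `make_frame`, `combine_frames`, and `build_flow`, along with the column names, are hypothetical and only for illustration. The combining step uses `pd.concat` rather than `DataFrame.append`, which was removed in pandas 2.0.

```python
import pandas as pd


def make_frame(x):
    # Stand-in for the mapped function: returns one small DataFrame per input.
    return pd.DataFrame({"value": [x], "squared": [x * x]})


def combine_frames(frames):
    # Runs at runtime, once the mapped results are real DataFrames,
    # so concatenating them here is safe.
    return pd.concat(frames, ignore_index=True)


def build_flow(items):
    # Assumption: Prefect 1.x is installed. Imported lazily so the two
    # plain functions above can be used and tested without Prefect.
    from prefect import Flow, task

    foo = task(make_frame)
    combine = task(combine_frames)

    with Flow("append-mapped-frames") as flow:
        # Build time: `frames` is only a placeholder task, not a list yet.
        frames = foo.map(items)
        # Passing it to another task defers the concat until runtime.
        final_df = combine(frames)

    return flow, final_df
```

With Prefect 1.x available, something like `flow, final_df = build_flow([1, 2, 3])` followed by `state = flow.run()` should expose the combined DataFrame as `state.result[final_df].result`.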
David Michael Carter

02/28/2022, 5:50 PM
@Kevin Kho does it again! Thanks dude. The function ended up not being parallelizable, but the mapping worked and simplified my code a little bit. Good to know the result of foo.map() is a list obj.