Fun with dataframes and map. So I have a map funct...
# ask-community
d
Fun with dataframes and map. So I have a map function that loads a dataframe in one task and then uses that dataframe in the next task. When I look at the return type on the first task, it is pandas.core.frame.DataFrame. When I look at the type of the dataframe value in the second task, it is class 'list'. I have the type hint set to pandas.DataFrame in the second task. Does anyone have any idea why the DataFrame is becoming a list?
k
The hint is not read and applied. It’s just a hint. Is your second task a map too?
d
The second task isn't a map. The flow is: map over a set of files, read each file into a dataframe, then pass the dataframe to the next task.
k
If task one is mapped and task two is not, task two receives a list of all the mapped results, so it acts as a reduce step.
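from prefect import Flow, task

@task
def abc(x):
    # Mapped task: runs once per element of the input.
    return x + 1

@task
def bcd(list_x):
    # Reduce step: receives the full list of mapped results.
    return sum(list_x)

with Flow(...) as flow:
    a = abc.map([1, 2, 3, 4, 5])
    b = bcd(a)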
a will be of type list here because it was the output of map, and bcd will take in a list because it is a reduce step.
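To see why the second task receives a list, here is a minimal plain-Python sketch of the same map-then-reduce shape, without Prefect. The names load_frame and combine are hypothetical, and each "dataframe" is stood in for by a list of row dicts so the snippet needs no extra libraries.

```python
def load_frame(path):
    # Hypothetical loader: stands in for reading one file into a DataFrame.
    return [{"source": path, "value": len(path)}]


def combine(frames):
    # Reduce step: receives the whole list of mapped results at once.
    # With real DataFrames this is where pandas.concat(frames) would go.
    rows = []
    for frame in frames:
        rows.extend(frame)
    return rows


paths = ["a.csv", "bb.csv", "ccc.csv"]

# Rough equivalent of load_frame.map(paths): one result per input, in a list.
frames = [load_frame(p) for p in paths]

# Rough equivalent of calling an unmapped task on the mapped output.
combined = combine(frames)

print(type(frames).__name__)  # list
print(len(combined))          # 3
```

The type hint on the reduce function plays no part in this; the list comes from the mapping itself.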
d
OK Thanks. I think I will have to combine the subsequent tasks into one to use the map.
This is all an attempt to add concurrency to the tasks in a flow. Maybe I need to look at flow of flows with Parameters instead?
QQ: If in your example it were b = bcd.map(a), would the map essentially iterate over a? I.e., in my case, would an additional map let me access the dataframes as individual elements of the mapped list?
Never mind. I discovered that I was correct. Thanks for your help.
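For anyone reading later, a rough plain-Python sketch of what chaining a second map does: the second function is called once per element instead of once on the whole list. The function names are hypothetical, and lists of numbers stand in for dataframes.

```python
def load_values(n):
    # Hypothetical first mapped step: produces one "frame" per input.
    return list(range(n))


def summarize(frame):
    # Hypothetical second mapped step: sees a single frame, not the list.
    return sum(frame)


inputs = [1, 2, 3]

# First map: one frame per input.
frames = [load_values(n) for n in inputs]

# Second map: iterate over the first map's output element by element.
totals = [summarize(frame) for frame in frames]

print(totals)  # [0, 1, 3]
```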