Is it possible to map a pandas DataFrame?
# prefect-community
Is it possible to map a pandas DataFrame?
Hi @Matthias! It looks like the answer is no for now: DataFrames don't play nicely with how we index mapped objects.
Hmm, alright. I could not find a nice way. My workaround is to return a dict from the DataFrame, but that leads Dask to fail due to “Large object of size 4.33 MB”.
Really? How many rows / cols do you have? Are you using the
function for your transformation?
Dask can pass around large dataframes (and 4.33 MiB isn't that large). So you should be able to return a dataframe fine from a task.
What are you trying to accomplish though?
If you want to map across the rows, I might use
In [15]: @task
    ...: def create():
    ...:     return pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}).to_records()

In [16]: @task
    ...: def transform(row):
    ...:     return row.a + row.b

In [17]: with Flow("test") as flow:
    ...:     data = create()
    ...:     res =

In [18]:
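For readers without a Prefect setup, here is a minimal sketch of the same row-wise idea using only pandas. It assumes nothing beyond what the snippet above shows: `to_records()` turns the DataFrame into a NumPy record array, which can be iterated one row at a time. The list comprehension is only a plain-Python stand-in for a mapped task, not Prefect's own mapping mechanism.

```python
import pandas as pd


def transform(row):
    # Each row is a NumPy record, so columns are available as attributes.
    # int() converts the NumPy integer to a plain Python int.
    return int(row.a + row.b)


df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# to_records() returns a NumPy record array; iterating it yields one record per row.
# By default the DataFrame index is included as an extra "index" field.
records = df.to_records()

# Stand-in for a mapped task: apply transform to each row independently.
results = [transform(row) for row in records]
print(results)  # [5, 7, 9]
```

Because each record is a small, self-contained object, every call to `transform` receives only one row instead of the whole DataFrame.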
I will get back to this tomorrow. Basically I am doing exactly what you are showing here.
I am not able to reproduce the issue, and it is not at all related to pandas. The error only happens if I map data, reduce and transform the results, and then map that result again.