Blake List
07/12/2021, 10:40 PMdf = df.apply(my_function, axis=1)
with prefect.
Thanks!nicholas
Kevin Kho
import pandas as pd
from prefect import task, Flow
from typing import Dict
import prefect
test = pd.DataFrame({'a': [1,2,3,4,5], 'b': [1,2,3,4,5]})
@task
def my_func(record: Dict):
record['c'] = record['a'] + record['b']
logger = prefect.context.get("logger")
<http://logger.info|logger.info>(record)
return record
@task
def get_records(df: pd.DataFrame):
return df.to_dict(orient='records')
@task
def assemble(records):
df = pd.DataFrame.from_dict(records)
logger = prefect.context.get("logger")
<http://logger.info|logger.info>(df.head())
return
with Flow("df_test") as flow:
records = get_records(test)
processed = my_func.map(records)
assemble(processed)
flow.run()
But this is only recommended when you need observability per row. If you don’t you should just use the pandas apply inside a task.Blake List
07/13/2021, 4:19 AMBlake List
07/13/2021, 4:26 AMKevin Kho
Blake List
07/13/2021, 4:55 AM