Dekel R
06/07/2022, 5:57 PM@task()
def task_a():
df = pd.Dataframe()
......
<http://logger.info|logger.info>(df.shape) # 19M and 13 cols
return df
@task()
def task_b(df):
<http://logger.info|logger.info>(type(df)) # Nonetype
with Flow('****',
storage=Docker(registry_url="us-central1-docker.pkg.dev****/",
dockerfile="./Dockerfile"), executor=LocalDaskExecutor(scheduler="processes")) as flow: #
df = task_a()
task_b(df)
Kevin Kho
06/07/2022, 8:13 PMDekel R
06/08/2022, 6:39 AMKevin Kho
06/08/2022, 6:44 AMDekel R
06/08/2022, 7:18 AMdf = pd.read_parquet(....)
pickled_df = cloudpickle.dumps(df)
depickled_df = pickle.loads(pickled_df)
Kevin Kho
06/08/2022, 2:38 PMDekel R
06/08/2022, 2:41 PMKevin Kho
06/08/2022, 2:47 PMDekel R
06/08/2022, 2:48 PMKevin Kho
06/08/2022, 3:22 PMpip show dask
Dekel R
06/08/2022, 5:19 PMKevin Kho
06/08/2022, 6:01 PM