Marwan Sarieddine
01/13/2022, 7:44 PMResourceManager
and using set_upstream
How can I ensure that a task is run after the resource manager cleanup ? (Please see the thread for more details)Marwan Sarieddine
01/13/2022, 7:45 PMwith Flow("dask-example") as flow:
n_workers = Parameter("n_workers", default=None)
out_path = Parameter("out_path", default="summary.csv")
with DaskCluster(n_workers=n_workers) as client:
# These tasks rely on a dask cluster to run, so we create them inside
# the `DaskCluster` resource manager
df = load_data()
summary = summarize(df, client)
# This task doesn't rely on the dask cluster to run, so it doesn't need to
# be under the `DaskCluster` context
write_csv(summary, out_path)
Marwan Sarieddine
01/13/2022, 7:45 PMwrite_csv
to only run after summarize
is run and more importantly after the DaskCluster.cleanup
is run ….Marwan Sarieddine
01/13/2022, 7:47 PMwith Flow("dask-example") as flow:
n_workers = Parameter("n_workers", default=None)
out_path = Parameter("out_path", default="summary.csv")
with DaskCluster(n_workers=n_workers) as client:
# These tasks rely on a dask cluster to run, so we create them inside
# the `DaskCluster` resource manager
df = load_data()
summary = summarize(df, client)
# This task doesn't rely on the dask cluster to run, so it doesn't need to
# be under the `DaskCluster` context
written = write_csv(summary, out_path)
written.set_upstream(summary)
Given this won’t set cleanup_task
as an upstream to written
…Marwan Sarieddine
01/13/2022, 7:52 PMwith Flow("dask-example") as flow:
n_workers = Parameter("n_workers", default=None)
out_path = Parameter("out_path", default="summary.csv")
client = DaskCluster(n_workers=n_workers)
with client:
# These tasks rely on a dask cluster to run, so we create them inside
# the `DaskCluster` resource manager
df = load_data()
summary = summarize(df, client)
# This task doesn't rely on the dask cluster to run, so it doesn't need to
# be under the `DaskCluster` context
written = write_csv(summary, out_path)
written.set_upstream([summary, client.cleanup_task])
is this the recommended approach ?Kevin Kho
Marwan Sarieddine
01/17/2022, 1:52 PM