Hello everyone, I am using Prefect with the Dask resource manager. I run some tasks inside the Dask resource manager context and a few outside it.
The flow looks like this:
with Flow("Test flow") as flow:
    with DaskCluster(n_workers=n_workers) as client:
        data = extract()
        processed_data = transform(data)
    save_data(processed_data)
The problem is that the Dask cluster does not shut down until the save_data function completes, but I expected the cluster cleanup to happen as soon as the transform function completes.
Is there any way to start save_data() only after transform() is done and cluster_cleanup has finished as well?
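The ordering being asked for can be sketched in plain Python. This is only an illustration of the expected sequence, not Prefect or Dask code: `fake_cluster`, `events`, `run`, and the function bodies are hypothetical stand-ins, and a real Prefect flow schedules tasks from its dependency graph rather than running them line by line.

```python
from contextlib import contextmanager

events = []  # records the order in which the steps run


@contextmanager
def fake_cluster(n_workers):
    # Hypothetical stand-in for a Dask cluster (not the Prefect/Dask API).
    events.append("cluster_setup")
    try:
        yield {"n_workers": n_workers}
    finally:
        # Runs when the with-block exits, before any later statement.
        events.append("cluster_cleanup")


def extract():
    events.append("extract")
    return [1, 2, 3]


def transform(data):
    events.append("transform")
    return [x * 2 for x in data]


def save_data(processed_data):
    events.append("save_data")
    return len(processed_data)


def run():
    events.clear()
    with fake_cluster(n_workers=2) as client:
        data = extract()
        processed_data = transform(data)
    # The stand-in cluster has already been cleaned up at this point.
    save_data(processed_data)
    return list(events)
```

In this sketch cleanup always precedes `save_data` because the `with` block exits first. In a flow, the equivalent would presumably need an explicit dependency between the cleanup step and `save_data`.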
Tim-Oliver
11/23/2022, 11:22 AM
I have not tried this, but wouldn't the sub-flow pattern solve this?
ash
11/23/2022, 11:23 AM
I expected the same, but my Dask cluster stays up until the last function has completed as well.
ash
11/23/2022, 11:25 AM
For your reference, the schematic looks like this:
Tim-Oliver
11/23/2022, 11:26 AM
Interesting, looking forward to reading what the prefects are saying 🙂