Chris Hart

04/06/2020, 8:15 PM
looking to use prefect with dask-ml on the default DaskExecutor.. how might I get the sklearn bits to use dask as the joblib backend as referenced here: https://ml.dask.org/joblib.html ?
or any docs about generally hooking into the underlying dask cluster from prefect tasks
(rn I'm just flying blind and as a first step would attempt to hook them both up to the same cluster as shown by https://docs.prefect.io/core/advanced_tutorials/dask-cluster.html) 🤞

Jim Crist-Harif

04/06/2020, 8:33 PM
Hi Chris, you should be able to access the underlying dask cluster in a task by using dask.distributed.worker_client. See the dask docs here: https://distributed.dask.org/en/latest/task-launch.html#connection-with-context-manager
However, if you're just trying to use the joblib backend and not dask directly, you should be able to follow the joblib example you linked directly - the underlying joblib backend sets up a client on the worker for you.
e.g. the following should work in a prefect task without needing to create a client yourself:
with joblib.parallel_backend('dask'):
    search.fit(digits.data, digits.target)
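A minimal sketch of the joblib approach above, written so it runs outside Prefect. Assumptions that are not from the thread: the fit_search and demo names, the SVC parameter grid, the data subset, and an in-process Client(processes=False) standing in for the cluster the DaskExecutor would provide. Inside a real Prefect task only the body of fit_search should be needed, since the joblib backend looks up the worker's client by itself.

```python
import joblib
from dask.distributed import Client
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC


def fit_search(search, X, y):
    """Fit a scikit-learn search with dask as the joblib backend.

    This is the part that would live inside a Prefect task.
    """
    with joblib.parallel_backend("dask"):
        return search.fit(X, y)


def demo():
    """Run fit_search against a small local cluster (assumption: stand-in
    for the cluster a DaskExecutor would already be connected to)."""
    with Client(processes=False, n_workers=1, threads_per_worker=2,
                dashboard_address=None):
        X, y = load_digits(return_X_y=True)
        # Hypothetical parameter grid, kept tiny so the sketch runs quickly
        search = GridSearchCV(SVC(), {"C": [0.1, 1.0, 10.0]}, cv=3)
        fitted = fit_search(search, X[:300], y[:300])
        return fitted.best_params_
```

Calling demo() returns the best parameters found by the search. Outside a dask worker a Client has to exist before entering the backend context, which is why demo creates one.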

Chris Hart

04/06/2020, 8:35 PM
woohoo thanks for the pointers!

Jim Crist-Harif

04/06/2020, 8:35 PM
No problem, happy to help

Mark Koob

05/09/2020, 12:26 PM
For anyone showing up here from the future - using distributed.worker_client() does give the correct behavior when launching a dask-ml search from inside a task when running on a dask cluster!
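from distributed import worker_client
from prefect import task

@task
def search_model(searchcv, x, y):
    # worker_client() connects to the dask cluster this task is running on
    with worker_client():
        return searchcv.fit(x, y)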
Thanks Jim!
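For reference, a small sketch of what worker_client does on its own, without Prefect or dask-ml. A function already running on a dask worker opens a client to the same cluster and fans out more work. The square, sum_of_squares and demo names and the in-process local cluster are illustrative assumptions, not something from the thread.

```python
from dask.distributed import Client, worker_client


def square(x):
    return x * x


def sum_of_squares(values):
    """Runs on a worker and submits sub-tasks to the same cluster."""
    # worker_client() only works inside a task executing on a dask worker
    with worker_client() as client:
        futures = client.map(square, values)
        return sum(client.gather(futures))


def demo():
    # Assumption: a threads-only local cluster stands in for the real one
    with Client(processes=False, n_workers=1, threads_per_worker=2,
                dashboard_address=None) as client:
        return client.submit(sum_of_squares, [1, 2, 3, 4]).result()
```

A Prefect task running on a DaskExecutor is in the same position as sum_of_squares here: it executes on a worker, so worker_client() can reach the scheduler it was launched from.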