Is it possible to run dask code within prefect tas...
# prefect-community
t
Is it possible to run dask code within prefect task? Is there a way to get submit work to the dask cluster that runs my flow (KubeCluster to LocalDaskCluster)? I am looking for a way to make my core functionality rely on dask and still be able to leverage the orchestration through prefect...
a
That's definitely possible. I realized this is a common question so I wrote this more extensive Discourse topic to explain it in more detail for you and for posterity: https://discourse.prefect.io/t/how-can-i-configure-my-flow-to-run-with-dask/45 In the section Prefect 1.0, there is an explanation of two ways you can use Prefect and Dask together: 1. Dask executor + mapping 2. Dask cluster set up with Prefect resource manager There is also this comparison of LocalDaskExecutor vs. DaskExecutor linked there: https://discourse.prefect.io/t/what-is-the-difference-between-a-daskexecutor-and-a-localdaskexecutor/374 And for your second question regarding
KubeCluster
, check out this thread: https://discourse.prefect.io/t/are-there-any-guidelines-for-using-a-temporary-vs-static-dask-cluster-how-to-set-one-on-kubernetes/342 LMK if you still have any questions about that, happy to help more.
šŸ”„ 1
šŸ‘‹ 1
šŸ§ 1
k
Also see this
šŸ‘€ 1
t
I was just coming here to reply that the links @Anna Geller sent are too high-level and do not give me the ability to "highjack" the dask cluster workers but then I saw the link @Kevin Kho shared with seems more like what I was trying to do. If I understand this correctly, it means I can start a flow on top of dask executor (my aim is to run it on k8s) and then submit additional tasks to the same cluster for execution. Am I correct? Do you know if there are implications to doing this? How would doing this (high-jacking the dask cluster) compares to using orion in order to do arbitrary tasks?
a
Why do you think resource manager is too high level? šŸ™‚ you can think of it as a ā€œnormalā€ context manager. The issue that Kevin shared is showing the same approach but without the benefits of a Prefectā€™s resource manager:
Copy code
@task
def compute_describe(df):
    with worker_client():
        return df.describe().compute()
vs:
Copy code
# use ResourceManager object
   with DaskCluster(
           cluster_type=cluster_type,
           n_workers=n_workers,
           software=software,
           account=account,
           name=name
           ) as client:
       push_events = get_github_data(filenames)
       df = to_dataframe(push_events)
       to_parquet(df)
t
Ah, I guess I missed that šŸ™ˆ
šŸ‘ 1
What are the advantages to use the resource manager here? In the example Kevin sent, by using worker_client, I am getting the dask cluster that the flow is running on, right? With the resource manager example - does it not create an "standalone"/"additional" cluster (depending of the flow runs in dask executor)
a
It depends on how you configure the cluster - it could either create a new one or use a standalone cluster. Either way, resource manager helps you manage the resources so that if something fails in your code, the cleanup method is executed rather than leaving some resources unhandled (exiting without closing DB connections, HTTP sessions, cluster resources, etc.) - check out this docs and this blog post for more
t
If I understand correctly, assuming I use dask cluster over k8s (KubernetesRun? KubeCluster?), using resource manager similar to the example you sent, will not give access to the same ad-hoc cluster created for the job. In the example @Kevin Kho shared, assuming the task is already running on some dask cluster, it will give access to the same cluster the task is running on. Am I correct?
a
Maybe it would help if you build an example and check for yourself? the resource manager is an abstraction to clean up some resources. To be honest, if you already have a long running cluster on K8, you don't need that
t
I am trying to use
worker_client
with Orion and I am getting an error - not sure whether it should even work. Here is what I tried:
Copy code
from dask.distributed import Client, worker_client
from prefect import task, flow
@task
def my_task(*args, **kwargs):
    with worker_client() as client:
        fut = client.submit(add, 30)
        return fut.result()

@flow(name="My Example Flow",
            ask_runner=DaskTaskRunner())
def my_flow(*args, **kwargs):
    # run parallel tasks and subflows with Dask
    return my_task()
Then, running this (all in ipython), I am getting this output:
Copy code
In [37]: state = my_flow()
10:01:41.229 | INFO    | prefect.engine - Created flow run 'cherubic-bloodhound' for flow 'My Example Flow'
10:01:41.229 | INFO    | Flow run 'cherubic-bloodhound' - Using task runner 'DaskTaskRunner'
10:01:41.230 | INFO    | prefect.task_runner.dask - Creating a new Dask cluster with `distributed.deploy.local.LocalCluster`
10:01:42.100 | INFO    | prefect.task_runner.dask - The Dask dashboard is available at <http://127.0.0.1:8787/status>
10:01:42.174 | INFO    | Flow run 'cherubic-bloodhound' - Created task run 'my_task-ec9685da-0' for task 'my_task'
10:01:44.861 | ERROR   | Task run 'my_task-ec9685da-0' - Encountered exception during execution:
Traceback (most recent call last):
  File "/Users/tomercagan/dev/orion/venv/lib/python3.9/site-packages/prefect/engine.py", line 703, in orchestrate_task_run
    result = await run_sync_in_worker_thread(task.fn, *args, **kwargs)
  File "/Users/tomercagan/dev/orion/venv/lib/python3.9/site-packages/prefect/utilities/asyncio.py", line 51, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(context.run, call, cancellable=True)
  File "/Users/tomercagan/dev/orion/venv/lib/python3.9/site-packages/anyio/to_thread.py", line 28, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable,
  File "/Users/tomercagan/dev/orion/venv/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 818, in run_sync_in_worker_thread
    return await future
  File "/Users/tomercagan/dev/orion/venv/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 754, in run
    result = context.run(func, *args)
  File "<ipython-input-35-75f9eaa5372f>", line 4, in my_task
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/Users/tomercagan/dev/orion/venv/lib/python3.9/site-packages/distributed/worker_client.py", line 55, in worker_client
    duration = time() - thread_state.start_time
AttributeError: '_thread._local' object has no attribute 'start_time'
/Users/tomercagan/dev/orion/venv/lib/python3.9/site-packages/prefect/client.py:1227: UserWarning: No default storage has been set on the server. Using temporary local storage for results.
  warnings.warn(
10:01:44.901 | ERROR   | Task run 'my_task-ec9685da-0' - Finished in state Failed('Task run encountered an exception.')
/Users/tomercagan/dev/orion/venv/lib/python3.9/site-packages/prefect/client.py:1227: UserWarning: No default storage has been set on the server. Using temporary local storage for results.
  warnings.warn(
10:01:45.354 | ERROR   | Flow run 'cherubic-bloodhound' - Finished in state Failed('1/1 states failed.')
Am I doing something wrong, or this is not possible in orion?
k
Thanks for the traceback. Will have to ask an engineer. Iā€™m not sure if this is supported yet.
t
The plan is that Orion will support all the features of 1.0? I am specifically wondering about results / artifacts - the team I work with really liked it
k
Results I am sure. Artifact might be further out but letā€™s wait for someone else to chime in
z
We havenā€™t tested it, but thereā€™s no reason it shouldnā€™t be working. Looks like itā€™s upset about retrieving the client in a worker thread, which we can disable when running sync tasks with the Dask task runner.
If you switch to an async task definition, I presume itā€™ll just work.
Weā€™ll add unit tests for this and make sure it works eventually.
šŸ™ 1
t
Generally I would like to keep away from async as much as possible as it kind of confused the team (mixed skill levels). I will give it a try though. Thanks
z
Thatā€™s fair šŸ™‚ you can actually use async tasks from a sync flow without changing the call so itā€™d just be a change to the task function definition.
It may be naughty to run blocking code in the task though since I think itā€™ll run on Daskā€™s event loop.
Anyway, weā€™ll fix this in one of the next couple releases.