# ask-community
j
I can't find a consistent answer on how to get prefect to run on a coiled-managed dask cluster. Will a DaskExecutor start the named cluster for me? Does it already need to be running? How do I give prefect.io credentials for coiled.io? Should I start the cluster inside a task, before I start the flow, or what?
k
Hey @Joseph Mathes, Coiled has an example of this in their docs. The Prefect DaskExecutor can point to a long-running cluster, or it can spin up an ephemeral cluster for the Flow by passing in a callable.
It is not in the tasks. It happens on the flow like this:
import coiled
from prefect.executors import DaskExecutor

executor = DaskExecutor(
    cluster_class=coiled.Cluster,
    cluster_kwargs={
        "software": "kvnkho/prefect",
        "shutdown_on_close": True,
        "name": "prefect-cluster",
    },
)
flow.executor = executor
flow.register(...)
There is a real script example here. So whichever process spins up the cluster (the agent, for example) needs to be logged in to Coiled so that it can start it.
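To make the two modes described above concrete, here is a minimal sketch assuming Prefect 1.x, where `DaskExecutor` takes either `address` (connect to a cluster that is already running) or `cluster_class` plus `cluster_kwargs` (create a temporary cluster for the flow run). The helper names `executor_kwargs` and `make_executor` and the scheduler address are hypothetical, used only for illustration.

```python
from typing import Any, Dict, Optional


def executor_kwargs(
    address: Optional[str] = None,
    cluster_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a Prefect 1.x DaskExecutor (hypothetical helper).

    With `address`, the executor connects to a cluster that is already running.
    Without it, the executor creates a temporary cluster from `cluster_class`
    when the flow run starts.
    """
    if address is not None:
        if cluster_kwargs:
            # The two modes are alternatives, so reject a mix of both.
            raise ValueError("cluster_kwargs only apply to a temporary cluster")
        return {"address": address}
    return {
        # Given as an import path so this module loads without coiled installed.
        "cluster_class": "coiled.Cluster",
        "cluster_kwargs": dict(cluster_kwargs or {}),
    }


def make_executor(**kwargs: Any):
    # Imported lazily; assumes Prefect 1.x is installed.
    from prefect.executors import DaskExecutor

    return DaskExecutor(**kwargs)


# Long-running cluster (the address is a placeholder, not a real endpoint):
long_running = executor_kwargs(address="tcp://scheduler.example:8786")

# Temporary cluster created for the flow run:
ephemeral = executor_kwargs(
    cluster_kwargs={"name": "prefect-cluster", "shutdown_on_close": True}
)
```

Either dict can then be passed to `make_executor(**...)` and the result assigned to `flow.executor`. In both modes, the process that runs the flow is the one that needs Coiled credentials.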
j
https://docs.coiled.io/user_guide/examples/prefect.html Here is an example of a task that spins up a Dask cluster as part of the task.
Hey, I recognize that username

https://www.youtube.com/watch?v=awXYHAkY2To&t=129s

Is this you?
This is even the code you linked me to
Yeah, I copied all of this verbatim. I have accounts on prefect.io and coiled.io, a Dask executor defined, and a local agent running.
The task shows as pending indefinitely in prefect.io.
@prefect.task
def transform():
    # Create and connect to Coiled cluster
    cluster = coiled.Cluster(n_workers=10, name="prefect-task")
    client = Client(cluster)
^ This is the part that spins up a cluster through the Coiled API during task execution
I think it's the ephemeral style you were talking about
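For the task-level style, one thing the snippet above leaves open is cleanup. A minimal sketch of one way to scope the cluster to the task follows, assuming the cluster and client objects support the context-manager protocol (Dask's `Client` and cluster classes do). The helper `run_with_cluster` is hypothetical, and whether the Coiled cluster actually shuts down on close depends on options such as `shutdown_on_close`.

```python
from typing import Any, Callable, ContextManager


def run_with_cluster(
    make_cluster: Callable[[], ContextManager[Any]],
    make_client: Callable[[Any], ContextManager[Any]],
    work: Callable[[Any], Any],
) -> Any:
    """Run `work(client)` against a cluster that lives only for this call."""
    # Leaving the `with` blocks closes the client first, then the cluster,
    # even if `work` raises.
    with make_cluster() as cluster:
        with make_client(cluster) as client:
            return work(client)


# Hypothetical usage inside a task (needs prefect, coiled and dask.distributed):
#
#   import coiled
#   import prefect
#   from dask.distributed import Client
#
#   @prefect.task
#   def transform():
#       return run_with_cluster(
#           lambda: coiled.Cluster(n_workers=10, name="prefect-task"),
#           Client,
#           lambda client: client.submit(sum, [1, 2, 3]).result(),
#       )
```

Passing the factories in as arguments keeps the helper importable without Coiled installed, which also makes it easy to exercise with stand-in objects.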
k
Yes, that is me. I see what you mean by task level. But the executor approach is also ephemeral: you pass a callable to the executor, and Prefect spins up the cluster when your Flow starts.
executor = DaskExecutor(
    cluster_class=coiled.Cluster,
    cluster_kwargs={
        "software": "kvnkho/prefect",
        "shutdown_on_close": True,
        "name": "prefect-cluster",
    },
)
flow.executor = executor
For task level, I suppose you can do that. It should work. We also have Resource Managers. If the task shows as pending indefinitely, your cluster may simply have failed to spin up, which happens. Check that you are not using spot instances, and check in the Coiled dashboard whether a cluster was created and whether it started successfully.
Did I use Coiled in that video? Man I forgot
j
you didn't
k
Man I probably ran out of time. I see the tab open.
j
I think you mentioned it
maybe you set up dask config to use coiled? I was rewinding and jumping around a lot
DaskExecutor(
    cluster_class=coiled.Cluster,
    cluster_kwargs={
        "name": cluster_name,
        "account": coiled_account,
        "software": software_env_name,
        "shutdown_on_close": False,
        "n_workers": workers,
        "worker_options": {
            "nthreads": 1,
        },
        "backend_options": {
            "region": region,
        },
    },
)
^ This is what I have
k
This one uses it:

https://www.youtube.com/watch?v=gbzL5TIFZZY

j
you have "username/cluster_name", but passing "account" separately is supported according to some doc I found. They all blend together
k
Timestamp 46:45 of that video is the Coiled part.
j
thanks; maybe it'll help
FWIW I'm not using conda or pip args to create a Docker image. Instead I'm using an ECR URI, and I have Coiled configured to access that account.
My Coiled dashboard shows a worker coming up, which is what I asked for, but prefect.io doesn't see it.
k
Does it ever materialize with 4/4 workers or does it just say 0/4?
j
I know I'm not giving enough information to debug. There's just too much context here to paste it all. I've been trying things for weeks
for me, n_workers=1
k
Ah, I’m mostly trying to find out whether it’s on the Prefect side or the Coiled side, because Coiled has logs in their backend of your cluster spin-up and can better tell you why it’s not spinning up, if that’s the issue.
j
On coiled.io, using DaskExecutor, I get num-workers at 0/1 and it stays that way. Previously I was using a Coiled API to start a cluster before I started trying to run a flow.
k
Yeah that sounds like the cluster is not spinning up. I would check in with them.
Are you part of their Slack channel?
j
not yet
k
j
thanks!
k
Yep, let me know what they say (though I’ll keep an eye out for your message there)
j
Seems like the tip about looking in coiled logs might have been pay dirt
k
What did you find about that constructor?
g
We haven't helped @Joseph Mathes much, @Kevin Kho - this thread is really helpful context though