# prefect-getting-started
j
I have a SLURM cluster and can successfully run Dask on it and distribute tasks. When I try to use local Prefect via `DaskTaskRunner` (specifying the cluster scheduler address), I get the following error:
Exception: PrefectHTTPStatusError("Client error '404 Not Found' for url 'http://ephemeral-prefect/api/task_runs/2832dc55-e6d8-4235-a6a5-ff4514b47067/set_state
Based on https://docs.prefect.io/latest/concepts/infrastructure/:
• The ephemeral Prefect API won't work with Docker and Kubernetes. You must have a Prefect server or Prefect Cloud API endpoint set in your agent's configuration.
The nodes on the SLURM cluster run Python via a Docker image. Is it the case that local Prefect execution just doesn't work with a remote Dask cluster, and that, as per the above, I should use Prefect Cloud or a Prefect server? (I was trying to start as simple as possible.)
Also happy to consider any other ways to integrate Prefect with SLURM (but going via Dask seemed the best route to take)!
j
I think the issue here is the distributed aspect. It looks like you're using the ephemeral API with SQLite (the default). So the task run is created in a DB where you're initiating your process, and then that ID (`2832dc55...`) does not exist in the SQLite DB that is being created wherever your work is running.
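To make that concrete, here is a toy model using only Python's standard `sqlite3` module. It is not Prefect's actual schema or code: the `task_run` table, `make_local_db`, and `set_state` are hypothetical names made up for illustration. It just shows how an ID written to one local database is invisible to a second, separate local database, which is roughly what a 404 on `set_state` would look like.

```python
import sqlite3
import uuid


def make_local_db() -> sqlite3.Connection:
    """Create a private in-memory database, standing in for one machine's local SQLite file."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified table; not Prefect's real schema.
    conn.execute("CREATE TABLE task_run (id TEXT PRIMARY KEY, state TEXT)")
    return conn


def set_state(conn: sqlite3.Connection, task_run_id: str, state: str) -> int:
    """Update a task run's state and return an HTTP-like status code (200 or 404)."""
    cur = conn.execute(
        "UPDATE task_run SET state = ? WHERE id = ?", (state, task_run_id)
    )
    conn.commit()
    return 200 if cur.rowcount == 1 else 404


# The process that starts the flow records the task run in its own database.
submitter_db = make_local_db()
task_run_id = str(uuid.uuid4())
submitter_db.execute("INSERT INTO task_run VALUES (?, ?)", (task_run_id, "PENDING"))
submitter_db.commit()

# A remote worker has its own, separate database that has never seen this ID.
worker_db = make_local_db()

status_on_worker = set_state(worker_db, task_run_id, "RUNNING")
status_on_submitter = set_state(submitter_db, task_run_id, "RUNNING")
print(status_on_worker, status_on_submitter)  # prints: 404 200
```

The same ID resolves fine where it was created and is "not found" everywhere else, which is why a database (or API) shared by all the machines makes the problem go away.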
j
Thanks, that makes sense. Sounds like I need to move to something like Postgres or try Prefect Cloud.
j
yepyep. You can either:
• keep using the ephemeral API, but your database has to be available to connect to from all the different places
• use Prefect Cloud/host your own Prefect API server
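For the second option, one quick sanity check is whether every machine involved is pointed at the same API. The sketch below assumes Prefect 2 behaviour as I understand it: the client reads the `PREFECT_API_URL` setting and falls back to the ephemeral API when it is unset. The helper name `describe_api_target` and the example hostname are hypothetical, and the check only looks at the environment variable, not at values stored in a Prefect profile.

```python
import os
from typing import Mapping, Optional


def describe_api_target(env: Optional[Mapping[str, str]] = None) -> str:
    """Report which API a process would talk to, judging only by its environment.

    Assumption: an unset or empty PREFECT_API_URL means the ephemeral API is used.
    """
    env = os.environ if env is None else env
    url = env.get("PREFECT_API_URL", "").strip()
    if not url:
        return "ephemeral (PREFECT_API_URL is not set)"
    return url


# Hypothetical server address; replace with wherever your Prefect API is reachable.
shared = {"PREFECT_API_URL": "http://prefect.example.internal:4200/api"}

print(describe_api_target({}))      # no URL configured
print(describe_api_target(shared))  # URL configured
print(describe_api_target())        # whatever this process actually has
```

If you run the same function on the submitting machine and on each Dask worker (for example through the Dask client's `run` method, assuming the workers can import it), every process should report the same URL. Any worker that reports "ephemeral" would be writing to its own local database again.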