Hugo Slepicka
10/13/2021, 9:28 PM
Zach Angell
Kevin Kho
Hugo Slepicka
10/13/2021, 11:55 PM
Kevin Kho
Kevin Kho
dask.jobqueue.SLURMCluster. Have you tried it? I outlined my thoughts here
An Hoang
11/10/2021, 9:53 PM
DaskExecutor connected to a dask cluster from dask.jobqueue.LSFCluster (similar to dask.jobqueue.SLURMCluster) but am encountering a lot of problems, so that would help HPC users test and triage to see whether the issue is in the pipeline code, the HPC, or the Dask cluster
An Hoang
11/10/2021, 9:56 PM
SlurmExecutor would have to persist all data in files, then submit tasks via command-line jobs (no Python API similar to LSF, IIRC)
Kevin Kho
Philip MacMenamin
01/25/2022, 8:11 PM
Philip MacMenamin
01/25/2022, 8:12 PM
import prefect
from prefect import task, Flow
import dask
import dask_jobqueue
from dask_jobqueue import SLURMCluster
from prefect.executors import DaskExecutor

def SLURM_exec():
    cluster = SLURMCluster()
    logging = prefect.context.get("logger")
    logging.debug(f"Dask cluster started")
    logging.debug(f"see dashboard {cluster.dashboard_link}")
    return cluster

@task
def hello_task():
    logger = prefect.context.get("logger")
    logger.info("Hello!")

with Flow("example", executor=DaskExecutor(cluster_class=SLURM_exec)) as flow:
    hello_task()
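A factory like `SLURM_exec` creates the cluster but, unless workers are configured elsewhere, never requests any. One thing worth ruling out, offered as a hedged sketch and not as a confirmed fix for this thread, is to pass resources and adaptive scaling through the `cluster_kwargs` and `adapt_kwargs` arguments of `DaskExecutor`. The helper names `slurm_cluster_kwargs` and `build_executor`, and the values for cores, memory, and worker counts, are assumptions made up for illustration.

```python
def slurm_cluster_kwargs(cores=1, memory="2GB", interface=None):
    """Build keyword arguments for dask_jobqueue.SLURMCluster.

    The defaults here are placeholder values, not site recommendations.
    """
    kwargs = {"cores": cores, "memory": memory}
    if interface is not None:
        # Only set the interface when one is explicitly chosen.
        kwargs["interface"] = interface
    return kwargs


def build_executor(**overrides):
    """Return a Prefect 1.x DaskExecutor backed by a temporary SLURM cluster."""
    # Imported lazily so the helper above can be used without Prefect installed.
    from prefect.executors import DaskExecutor

    return DaskExecutor(
        cluster_class="dask_jobqueue.SLURMCluster",
        cluster_kwargs=slurm_cluster_kwargs(**overrides),
        # Adaptive scaling is only enabled when adapt_kwargs is provided.
        adapt_kwargs={"minimum": 1, "maximum": 4},
    )


if __name__ == "__main__":
    print(slurm_cluster_kwargs(interface="ib0"))
```

With this shape the flow would be declared as `Flow("example", executor=build_executor())`, and the cluster settings stay in one place that is easy to adjust per site.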
Philip MacMenamin
01/25/2022, 8:12 PM
Kevin Kho
Kevin Kho
Philip MacMenamin
01/25/2022, 8:53 PM
I'm seeing this log:
20:48:37 INFO agent Submitted for execution: PID: 2061
20:48:39 INFO CloudFlowRunner Beginning Flow run for 'example'
20:48:40 INFO DaskExecutor Creating a new Dask cluster with `None.SLURM_exec`...
20:48:41 INFO DaskExecutor The Dask dashboard is available at http://xxx:46778/status
I'm seeing the job get started, but no workers getting started.
Philip MacMenamin
01/26/2022, 3:56 PM
ens160 and spotted this:
ValueError: 'ens160' is not a valid network interface. Valid network interfaces are: ['lo', 'ethbond0', 'ibbond0', 'idrac', 'em1', 'em2', 'ib0', 'ib1']
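An error like this means the configured interface name does not exist on the host that raised it. A minimal sketch for checking which names are available, using only the standard library, is below. The helper names and the preference order are assumptions for illustration, and the script would need to run on the node that reported the error.

```python
import socket


def available_interfaces():
    """Return the names of the network interfaces visible on this host."""
    return [name for _index, name in socket.if_nameindex()]


def pick_interface(preferred, available=None):
    """Return the first preferred name that exists on this host, or None."""
    if available is None:
        available = available_interfaces()
    for name in preferred:
        if name in available:
            return name
    return None


if __name__ == "__main__":
    names = available_interfaces()
    print("interfaces on this host:", names)
    # Hypothetical preference order; adjust to the site's network layout.
    print("would use:", pick_interface(["ib0", "ethbond0", "em1"], names))
```

The selected name can then be passed as the `interface` argument of `SLURMCluster`, or set in the dask-jobqueue configuration, in place of `ens160`.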
I'll try to get a minimal example up for you guys.
Kevin Kho