Andrew Rosen
07/25/2023, 1:26 AM
[...] prefect-dask and dask-jobqueue to launch Prefect flows to a job queuing system on an HPC cluster. However, the compute nodes don't have network connectivity.
Is there any hope of using Prefect Cloud in this scenario? Just curious if I'm missing something obvious.
Nate
07/25/2023, 2:30 AM
Andrew Rosen
07/25/2023, 2:33 AM
[...] prefect-dask DaskTaskRunner allows me to submit flows from the login node to the compute nodes, where they will run. This works perfectly fine. But because the compute nodes have no network connection, the run crashes at the end because it can't report the result of the flow run back to Prefect Cloud.
While in principle I could run a Prefect Server on the login nodes, the HPC staff won't approve of that because it would impact other users. So, I'm having trouble seeing if there is a way for me to use Prefect in this scenario.
I was able to use Prefect on other HPC machines that do have network connectivity on the compute nodes, but that is a rarity.
Nate
07/25/2023, 2:38 AM
Andrew Rosen
07/25/2023, 2:40 AM
seems possible in principle to run a worker on the login node that would submit work to compute nodes while it communicates with prefect cloud
yeah I haven't seen many examples either. I'll keep experimenting. I got close, but it requires a combination of prefect, prefect-dask, and dask-jobqueue so there aren't many people with knowledge about it...
Nate
07/25/2023, 2:43 AM
Andrew Rosen
07/25/2023, 2:44 AM
[...] DaskTaskRunner. I feed this DaskTaskRunner as a task_runner argument to the @flow, and if I do that from a login node, it will spin up a Dask cluster on the compute nodes and submit the flow for execution. The problem is when the results need to be reported back to Prefect Cloud.
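For readers following along, the setup described here might look roughly like the sketch below. It is not taken from the thread: the SLURM scheduler, the resource values, and the helper names build_runner_kwargs and make_flow are all assumptions for illustration. The Prefect imports are deferred so the configuration helper can be inspected on a machine without Prefect installed.

```python
# Sketch only: the scheduler type (SLURM) and every resource value here are
# assumptions for illustration; the thread does not state them.


def build_runner_kwargs(cores=16, memory="64GB", walltime="00:30:00", max_workers=2):
    """Return keyword arguments for prefect_dask.DaskTaskRunner (hypothetical helper)."""
    return {
        # Import path of the dask-jobqueue cluster class that submits batch jobs.
        "cluster_class": "dask_jobqueue.SLURMCluster",
        # Passed to the cluster class; describes one batch job on a compute node.
        "cluster_kwargs": {"cores": cores, "memory": memory, "walltime": walltime},
        # Passed to cluster.adapt(); bounds the number of Dask workers.
        "adapt_kwargs": {"minimum": 1, "maximum": max_workers},
    }


def make_flow():
    """Build a flow whose tasks run on a Dask cluster started through the job queue."""
    # Imported lazily so build_runner_kwargs works without Prefect installed.
    from prefect import flow, task
    from prefect_dask import DaskTaskRunner

    @task
    def add(x, y):
        return x + y

    @flow(task_runner=DaskTaskRunner(**build_runner_kwargs()))
    def workflow():
        # submit() hands the task to the Dask cluster; result() waits for it.
        future = add.submit(1, 2)
        return future.result()

    return workflow
```

Calling make_flow()() from the login node would ask the scheduler for compute-node jobs and run the task there. It needs prefect, prefect-dask, and dask-jobqueue installed, and it would still run into the reporting problem described in this message if the compute nodes cannot reach the Prefect API.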
anyway, not a problem! 🙂 I know this isn't the common usage scenario. definitely let me know if you find any info!
Nate
07/25/2023, 2:54 AM
[...] Service level work for you, but I can definitely see if anyone has more context on a setup like this internally!
Andrew Rosen
07/25/2023, 2:54 AM