# ask-community
v
Dear all, as a newcomer evaluating Prefect I am struggling a bit to understand exactly when I may need Prefect server and/or Prefect cloud. Our main application will involve processing array data (dask/numpy arrays, mostly representing images in the life sciences) from high-throughput imaging experiments. We hope that dask will help us
• distribute the processing across HPC nodes (Slurm cluster)
• save intermediate results and restart processing runs from those intermediate results if the processing has been cancelled or terminated due to an error.
From my initial reading I thought that we would have to at least set up a Prefect server to handle distributed computing, but from initial experiments it appears that simply using a `DaskExecutor` can handle the distribution across nodes even if no Prefect server is running. Is this correct, or am I overlooking something important here? We don't necessarily need all the web UI features, such as the nicely formatted logs and status or the ability to trigger registered workflows from the web UI by clicking a button.
k
Hey @Volker Hilsenstein, you are right that the DaskExecutor will handle the distributed computation. The UI is meant to help with orchestration, and Prefect Core on its own may not provide all the orchestration functionality you need.
v
Thanks @Kevin Kho, very useful to know we can handle the distributed computation with dask. To go into more specifics: would re-using intermediate (serialized) results fall under "orchestration" in your answer? I often associate "orchestration" with spinning up and shutting down containers according to schedules or demand, but I am not sure whether that is the correct definition.
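To make "re-using intermediate (serialized) results" concrete, here is a minimal plain-Python sketch of the idea: each step writes its result to disk, and a restarted run loads that file instead of recomputing. This is not Prefect's API (Prefect has its own result and checkpointing facilities, which are not used here). The helper name `checkpointed`, the `.pkl` file layout, and the step name `"denoise"` are all hypothetical, chosen only for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def checkpointed(cache_dir, name, compute):
    """Return the stored result for `name` if it exists, else compute and store it.

    Hypothetical helper for illustration only; not part of Prefect or dask.
    """
    path = Path(cache_dir) / f"{name}.pkl"
    if path.exists():
        # A previous (possibly interrupted) run already finished this step.
        with path.open("rb") as f:
            return pickle.load(f)

    result = compute()

    # Write to a temporary file first, then rename, so that a run killed
    # mid-write does not leave a truncated checkpoint behind.
    tmp = path.with_suffix(".tmp")
    with tmp.open("wb") as f:
        pickle.dump(result, f)
    tmp.replace(path)
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache_dir:
        def expensive_step():
            print("computing...")
            return [x * x for x in range(5)]

        first = checkpointed(cache_dir, "denoise", expensive_step)   # computes
        second = checkpointed(cache_dir, "denoise", expensive_step)  # loads from disk
        print(first == second)
```

For large image arrays a chunked array format would likely be a better fit than pickle; pickle is used here only to keep the sketch within the standard library.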