# prefect-community
Hey, I'm experimenting with prefect and dask workers running on multiple servers and trying to achieve the following: I run a flow from time to time that uses all the workers and would like to cache the results for future flow runs so that servers can access each others cache. The servers do not have a shared drive, and I can not bind the task to a specific server either. Based on https://github.com/PrefectHQ/prefect/issues/2636, having this kind of distributed cache is not possible currently out of the box with dask, or am I missing some crucial piece of prefect knowledge?
Hi! I’d recommend using something like S3 for your results — they don’t need to be on the server’s disk. https://docs.prefect.io/api/latest/engine/results.html#prefectresult
Thanks, but unfortunately, cloud services are not an option either 😅
My best recommendation then is to host something S3-compatible on one of your servers 🙂 perhaps MinIO https://docs.min.io/docs/minio-quickstart-guide.html
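For reference, a minimal way to stand up MinIO is the Docker invocation from the quickstart guide linked above. The mount path and credentials below are placeholders — pick your own:

```shell
# Run a MinIO server on one of your servers (Docker quickstart).
# MINIO_ACCESS_KEY / MINIO_SECRET_KEY are placeholder credentials;
# /mnt/data is a placeholder host path for persisted objects.
docker run -d -p 9000:9000 \
  -e "MINIO_ACCESS_KEY=minioadmin" \
  -e "MINIO_SECRET_KEY=minioadmin" \
  -v /mnt/data:/data \
  minio/minio server /data
```

Every server (and dask worker) that can reach port 9000 on that host can then read and write the same result objects.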
Okay, thank you. I will consider it
Basically that or an NFS — to persist results across flow runs you’ll need somewhere to put them. You could investigate subclassing `Result` to store them on your dask cluster, but I think the disk is a safer bet.
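To make the subclassing idea concrete: a result class is essentially `exists`/`read`/`write` against some backing store. Here is a minimal sketch with a simplified stand-in base class and a plain dict in place of real shared storage — the class and method names are illustrative, not Prefect's actual `Result` API:

```python
# Simplified stand-in for a result base class; Prefect's real
# Result class has a richer interface (serializers, templating, ...).
class Result:
    def exists(self, location: str) -> bool:
        raise NotImplementedError

    def read(self, location: str):
        raise NotImplementedError

    def write(self, value, location: str) -> str:
        raise NotImplementedError


class DictResult(Result):
    """Stores results in a dict, standing in for a store that every
    server can reach (an S3/MinIO bucket, an NFS path, ...)."""

    def __init__(self, store: dict):
        self.store = store

    def exists(self, location: str) -> bool:
        return location in self.store

    def read(self, location: str):
        return self.store[location]

    def write(self, value, location: str) -> str:
        self.store[location] = value
        return location


shared = {}  # in reality: storage reachable from all workers
result = DictResult(shared)
result.write({"answer": 42}, "flow-run-1/task-a")
print(result.exists("flow-run-1/task-a"))  # True
print(result.read("flow-run-1/task-a"))    # {'answer': 42}
```

Whichever backend you choose, the caching question reduces to: can every worker resolve the same `location` to the same bytes? That is why a shared store beats per-node disk here.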