# prefect-community
h
Hey, I'm experimenting with prefect and dask workers running on multiple servers and trying to achieve the following: I run a flow from time to time that uses all the workers, and I would like to cache the results for future flow runs so that the servers can access each other's cache. The servers do not have a shared drive, and I cannot bind a task to a specific server either. Based on https://github.com/PrefectHQ/prefect/issues/2636, this kind of distributed cache is not currently possible out of the box with dask, or am I missing some crucial piece of prefect knowledge?
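For context, roughly what my setup looks like (the scheduler address, task body, and cache settings here are just illustrative):
```python
from datetime import timedelta

from prefect import Flow, task
from prefect.engine.executors import DaskExecutor  # prefect.executors in newer releases


@task(cache_for=timedelta(days=1))  # I'd like this cache to survive across flow runs
def expensive_task():
    # stand-in for the real work that runs across the dask workers
    return sum(range(10_000_000))


with Flow("distributed-cache-experiment") as flow:
    expensive_task()

# dask scheduler fronting workers on several servers (address is made up)
flow.run(executor=DaskExecutor(address="tcp://dask-scheduler:8786"))
```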
👀 1
z
Hi! I’d recommend using something like S3 for your Results — they don’t need to be on the server’s disk. https://docs.prefect.io/api/latest/engine/results.html#prefectresult
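For example (the bucket and target names are placeholders, and checkpointing needs to be turned on when running with Core only):
```python
from prefect import Flow, task
from prefect.engine.results import S3Result

# placeholder bucket; with plain Prefect Core also set PREFECT__FLOWS__CHECKPOINTING=true
s3_result = S3Result(bucket="prefect-results")


@task(result=s3_result, checkpoint=True, target="{flow_name}/{task_name}.prefect")
def expensive_task():
    # if the target already exists in S3, the task is skipped and the stored result is reused
    return sum(range(10_000_000))


with Flow("cached-flow", result=s3_result) as flow:
    expensive_task()
```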
h
Thanks, but unfortunately, cloud services are not an option either 😅
z
My best recommendation, then, is to host something S3-compatible on one of your servers 🙂 perhaps MinIO: https://docs.min.io/docs/minio-quickstart-guide.html
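If you go that route, I believe you can point the S3 result at your MinIO endpoint through the boto3 client options, something like this (the endpoint and credentials are placeholders, and the exact `boto3_kwargs` pass-through may differ between Prefect versions, so treat it as a sketch):
```python
from prefect.engine.results import S3Result

# Sketch: aim boto3 at a self-hosted MinIO server instead of AWS S3.
minio_result = S3Result(
    bucket="prefect-results",
    boto3_kwargs={
        "endpoint_url": "http://minio.internal:9000",  # placeholder MinIO address
        "aws_access_key_id": "minio-access-key",       # placeholder credentials
        "aws_secret_access_key": "minio-secret-key",
    },
)
```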
👍 1
h
Okay, thank you. I will consider it
z
Basically that or an NFS — to persist results across flow runs you’ll need somewhere to put them. You could investigate subclassing Result to store them on your dask cluster, but I think the disk is a safer bet.
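If you do want to poke at the subclassing route, here's a very rough sketch of what a Result storing values as published dask datasets could look like (all names are made up, it's untested, and the data only lives as long as the scheduler process, which is why I'd still lean toward disk or object storage):
```python
from distributed import Client

from prefect.engine.result import Result


class DaskDatasetResult(Result):
    """Rough sketch: keep task results as named datasets on the dask scheduler."""

    def __init__(self, scheduler_address: str, **kwargs):
        self.scheduler_address = scheduler_address
        super().__init__(**kwargs)  # accepts e.g. location="{flow_name}/{task_name}"

    def _client(self) -> Client:
        # a real implementation would reuse/close this connection
        return Client(self.scheduler_address)

    def exists(self, location: str, **kwargs) -> bool:
        return location.format(**kwargs) in self._client().list_datasets()

    def read(self, location: str) -> "Result":
        new = self.copy()
        new.location = location
        new.value = self._client().get_dataset(location)
        return new

    def write(self, value_, **kwargs) -> "Result":
        new = self.format(**kwargs)  # renders the location template from context
        new.value = value_
        self._client().publish_dataset(value_, name=new.location)
        return new
```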