Bivu Raj

07/16/2020, 6:21 PM
It might be a strange question, but what is the relationship between Docker storage and a Dask cluster? What I mean is, even if I have all the dependencies satisfied via Docker storage, how will the workers in Dask resolve them? Does cloudpickle handle that?
👀 1

Laura Lorenz (she/her)

07/16/2020, 7:30 PM
Hi @Bivu Raj! Docker storage really has more to do with your agent first. If, for example, you are using the Kubernetes agent, it needs to start the flow as a Kubernetes pod, which means it needs a container image that has the flow in it, since that is how Kubernetes works. By configuring Docker storage on the flow, you tell Prefect (and thus your agent) during the flow.register() step where your Docker image is, so the agent can pull it to start the flow job. After that, if you have a separate Dask cluster that is not configured through Prefect (as it would be with, for example, Prefect’s DaskKubernetesEnvironment), then the flow storage doesn’t guarantee your workers will have all the dependencies; you will need to guarantee that within your Dask cluster yourself.
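
On the cloudpickle part of the question: cloudpickle serializes functions from installed packages by reference (module name plus attribute name), not by copying their source, so those packages still have to be importable on each worker. Here is a rough sketch of one way to check that on an existing cluster. The helper name `missing_modules`, the scheduler address, and the module list are all made up for illustration and are not part of Prefect or Dask; only `Client.run` in the commented-out usage is a real Dask call.

```python
def missing_modules(names):
    """Return the module names that cannot be found in the current process."""
    # Imported inside the function so it stays self-contained when it is
    # shipped to a worker.
    import importlib.util

    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised e.g. when the parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# Local check (runs in whatever process imports this file):
local_report = missing_modules(["json", "not_a_real_module_for_this_example"])

# Hypothetical usage against a separate Dask cluster (address is an assumption):
# from dask.distributed import Client
# client = Client("tcp://my-scheduler:8786")
# report = client.run(missing_modules, ["pandas", "my_flow_helpers"])
# `report` maps each worker address to the list of modules that worker lacks.
```

If any worker reports a missing module, the usual fix is to build the Dask worker image from the same base (or the same requirements) as the flow's Docker storage image, so both sides see the same packages.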


07/17/2020, 8:33 AM
@Severin Ryberg [sevberg] I think that also answers some of our remaining questions 🤔