Hi, the KubernetesJob docs say that you need to have a remote file store configured to use it. And the Storage docs say that none of the remote file storage libraries are installed by default. Now my flow stored in S3 is failing to deploy via the k8s agent because of the missing file store libraries. How do you recommend including something like s3fs in the KubernetesJob environment, since it isn’t included in the default package set?
The Storage docs say that you can use `EXTRA_PIP_PACKAGES` for the Docker executor. Is there something similar for the k8s job executor?
Ilya Galperin
08/11/2022, 7:58 PM
I ran into the same issue and was not able to get `EXTRA_PIP_PACKAGES` to work using the `kubernetes-job` infrastructure. In our case, nearly all of our flows rely on custom images that vary from flow to flow, so my workaround is just including `s3fs` (or whatever other file store) in the base Dockerfile that our custom images are built off of.
Ilya Galperin
08/11/2022, 7:59 PM
Then pointing to that image in the infrastructure block.
Ilya Galperin
08/11/2022, 7:59 PM
This might be faster than installing at runtime I’d think as well
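A quick way to confirm that a custom image really contains the file store library is to run a small import check inside it before pointing the infrastructure block at that image. The sketch below is only an illustration: the helper name `missing_packages` is hypothetical, and it assumes the base Dockerfile installs the library with something like `pip install s3fs`.

```python
import importlib.util


def missing_packages(required):
    """Return the top-level package names in `required` that cannot be imported here."""
    return [name for name in required if importlib.util.find_spec(name) is None]


# s3fs is the library named in this thread; swap in another file store
# library if your flows live somewhere other than S3.
print("Missing from this environment:", missing_packages(["s3fs"]))
```

Running this with the image's Python interpreter should report an empty list once the library has been baked in; a non-empty list suggests the flow would hit the same missing-library error at deploy time.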
Mars
08/11/2022, 8:15 PM
I’m in the same boat: most of our projects need a few custom libraries, too. I was hoping to stand up a Prefect proof-of-concept without doing the whole container-build->artifact-repo->flow deployment process, but I guess I can’t avoid it. 🤷 Thanks!
Anna Geller
08/11/2022, 11:22 PM
FWIW, if this is easier for you, you could install those packages into a target directory and upload it alongside your flow.
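One possible reading of this suggestion, assuming "target" refers to pip's `--target` option, is to vendor the packages into a directory that ships with the flow and put that directory on `sys.path` at runtime. The helper names and the `vendor` directory below are hypothetical, and whether this works depends on when the library is needed: if it is required to fetch the flow from remote storage in the first place, it has to be present before the uploaded files are available.

```python
import sys
from pathlib import Path


def pip_target_command(packages, target_dir):
    """Build a `pip install --target` command that vendors packages into target_dir."""
    return [sys.executable, "-m", "pip", "install", "--target", str(target_dir), *packages]


def add_to_path(target_dir):
    """Put the vendored directory at the front of sys.path (only once)."""
    path = str(Path(target_dir).resolve())
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Shows the command only; run it yourself (e.g. with subprocess.run) to install.
print(" ".join(pip_target_command(["s3fs"], "vendor")))
```

Calling `add_to_path("vendor")` at the top of the flow module, before importing the vendored library, would make the uploaded copy importable.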