# ask-marvin
c
@Marvin When using GitHub Storage + Google Cloud Run, does Prefect only store the single `flow.py` file and not the whole repository when executing it in a container? Would this mean that if I have dependencies in other folders within the repo, I'd have to bake them into the image I'm using in Google Cloud Run? I'm getting a Module Not Found error when trying to import my own modules within the same repo, and it seems like this might be the reason.
m
Ah, the joys of dependencies. Yes, when using GitHub Storage with Google Cloud Run, Prefect only stores the `flow.py` file and not the entire repository when executing it in a container. This means that if you have dependencies in other folders within the repo, you would indeed need to bake those into the image you're using in Google Cloud Run. You see, when Prefect runs a flow, it isn't aware of the entire repository structure; it only knows about the `flow.py` file. So if you try to import modules from other folders in your repository, Prefect won't be able to find them, hence the "Module Not Found" error you're encountering.

You can include your flow code directly in a Docker image. That way, all your dependencies are included in the image and you don't have to worry about Prefect not finding them. To do this, run `prefect init` in the root of your repository, choose the `docker` recipe, and follow the prompts to create a `prefect.yaml` file with a build step that bakes the flow code into a Docker image. After that, run `prefect deploy` to create a deployment. At run time, the worker pulls the Docker image and spins up a container, and the flow code baked into the image runs inside it. I hope that helps. Or not. I'm a robot. What do I know about hope?
Ah, the joys of distributed computing. You're correct in your assumption. When using GitHub Storage with Prefect, only the specific `flow.py` file is stored, not the entire repository, so dependencies in other folders within the same repository won't be available when executing the flow in a container on Google Cloud Run. To resolve this, you'd indeed have to bake those dependencies into the Docker image you're using on Google Cloud Run. This is because when a flow runs, the execution environment needs access to its code, and the flow code is not stored in a Prefect server database or in Prefect Cloud. You can include your flow code directly in a Docker image; when the deployment runs, the worker pulls the image, spins up a container, and the flow code baked into the image runs inside it. You don't need push or pull steps in the `prefect.yaml` file if you're using CI/CD to build the Docker image outside of Prefect; instead, the work pool can reference the image directly. So, in essence, you're not losing your mind. It's just the way things are. Isn't technology wonderful?
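If you build the image in CI/CD instead, the key point is that the copy step brings in the whole repository, not just `flow.py`, so sibling packages resolve at import time. A hedged sketch (base image tag, paths, and the `requirements.txt` name are assumptions):

```dockerfile
# Sketch of a Dockerfile that bakes the whole repo into the image.
# Base image tag and paths are assumptions, not taken from this thread.
FROM prefecthq/prefect:2-python3.11

WORKDIR /opt/prefect/flows

# Copy the ENTIRE repository, not just flow.py, so that imports like
# `from my_utils.helpers import ...` resolve inside the container.
COPY . .

# Install any third-party dependencies the repo declares (assumed filename).
RUN pip install --no-cache-dir -r requirements.txt
```

The work pool's job configuration can then point at this image directly, with no `pull` step needed to fetch code at run time.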
c
If I bake the dependencies into the image, how will this affect the entry point of the flow when using GitHub Storage + Cloud Run?