Hello, when using GitHub-based storage, is it possible to use other files located in the repo within tasks?
I ask because when I check the temporary directories where I suspect the repo is cloned, none of the repo's files seem to exist.
Kevin Kho
12/09/2021, 6:26 PM
Hey @William Clark, Git storage lets you load YAML and SQL files but not Python files. The Git-based storage classes clone the repo temporarily, load the Flow, and then delete the clone.
So this isn’t a mechanism that will work for importing functions from other files.
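To make the clone, load, delete sequence concrete, here is a simplified simulation using only the Python standard library. It is not Prefect's code: the file names, the `shared_tasks_demo` module, and the use of `exec` on the flow file's text are assumptions made for illustration. It shows why a sibling module in the same repo is not importable when only the flow file is loaded from a temporary directory.

```python
import tempfile
from pathlib import Path

# Hypothetical file contents standing in for a repository with a flow and a helper.
FLOW_SOURCE = "import shared_tasks_demo\nresult = shared_tasks_demo.double(21)\n"
HELPER_SOURCE = "def double(x):\n    return 2 * x\n"


def load_flow_like_git_storage() -> str:
    """Simulate clone -> load one file -> delete, and report what happened."""
    # The temporary directory stands in for the short-lived clone.
    with tempfile.TemporaryDirectory() as clone_dir:
        repo = Path(clone_dir)
        (repo / "flow.py").write_text(FLOW_SOURCE)
        (repo / "shared_tasks_demo.py").write_text(HELPER_SOURCE)

        # Only the flow file's text is read; the clone is never added to sys.path.
        contents = (repo / "flow.py").read_text()
        namespace = {}
        try:
            exec(contents, namespace)
        except ModuleNotFoundError as exc:
            return f"import failed: {exc.name}"
    # The directory (and the helper file) is deleted once the block exits.
    return f"loaded: {namespace['result']}"


print(load_flow_like_git_storage())
```

The helper file sits right next to the flow file, yet the import fails because Python resolves imports through `sys.path`, not relative to the text being executed.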
M. Siddiqui
12/13/2021, 11:39 AM
I also stumbled upon this issue.
I am bound to have many flows reusing many common tasks and functionalities.
@William Clark @Kevin Kho
What do you recommend (for storage) in these scenarios?
Kevin Kho
12/13/2021, 2:42 PM
Use Docker Storage and install the dependencies in the container. Or you can build a container image that has the dependencies and use DockerRun to run the Flow on top of it.
William Clark
12/13/2021, 8:34 PM
What @Kevin Kho suggested! What I ended up doing was creating a parent flow that uses GitPython and the Prefect Docker Tasks: https://docs.prefect.io/api/latest/tasks/docker.html#pushimage. This parent flow clones multiple repos that contain different model scoring Flows and builds a base image. The dependencies, such as a Dockerfile and the requirements.txt file, are included in the cloned repositories. After the base image is built, it is pushed to an ECR repository. In subsequent flows, that ECR repository is passed in as the image_name parameter of Docker Storage. Then the Flow is registered with flow.register(name='Test', build=False), which came from this example by @Anna Geller: https://github.com/anna-geller/packaging-prefect-flows/blob/master/flows_no_build/docker_script_docker_run_local_image.py
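For anyone landing here later, a minimal sketch of that pattern for Prefect 1.x might look like the following. The registry URL, image name, tag, flow name, and project name are all hypothetical placeholders. It also assumes the flow file has already been copied into the prebuilt image at the `path` shown, and it passes the project as `project_name`, which is the first parameter of `flow.register` in Prefect 1.x. It needs Prefect 1.x installed and a backend to register against, so treat it as a configuration outline rather than a tested script.

```python
from prefect import Flow, task
from prefect.run_configs import DockerRun
from prefect.storage import Docker

# Hypothetical values: replace with your own ECR registry, image, and flow name.
REGISTRY_URL = "123456789012.dkr.ecr.us-east-1.amazonaws.com"
IMAGE_NAME = "scoring-base"
IMAGE_TAG = "latest"
FLOW_NAME = "example_scoring_flow"


@task
def score():
    # Shared modules installed in the image can be imported here at run time.
    print("scoring")


with Flow(
    FLOW_NAME,
    # Point storage at the prebuilt image instead of building one at registration.
    storage=Docker(
        registry_url=REGISTRY_URL,
        image_name=IMAGE_NAME,
        image_tag=IMAGE_TAG,
        stored_as_script=True,
        # Assumed location of this file inside the prebuilt image.
        path=f"/opt/prefect/flows/{FLOW_NAME}.py",
    ),
    # Run the flow in a container started from the same image.
    run_config=DockerRun(image=f"{REGISTRY_URL}/{IMAGE_NAME}:{IMAGE_TAG}"),
) as flow:
    score()

if __name__ == "__main__":
    # build=False skips the image build and push, since the image already exists.
    flow.register(project_name="Test", build=False)
```

Because the image is built once by the parent flow, every flow that shares the common tasks can reuse it and only needs to be re-registered when its own code changes.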