# prefect-server
p
Reading the `Git` storage class source, am I correct to assume that it’s not possible to pull from a local repository?
k
Yes… but wouldn’t that basically be a module? Maybe `Module` storage is an option?
p
k
Oh I see. Yes, though `Git` storage is not meant for local repos.
p
Ok. Also I think the link I just posted is unrelated. Tried so many things today that I think I am 🧠 🍟
k
Maybe 😅. You could try it. The module has to be pip installed for this one though, I think.
p
So I tried with `Module` but I get an `ImportError`. I am unclear whether the `Module` storage is accessed from the context the Agent lives in, or from inside the run config (`Docker` in my case).
In other words: should I make my `flows` package available to the Docker image of my run config, or just to my agent (running in a `pipenv` virtualenv in my case)?
k
Relative to the Docker agent I think.
p
That’s also what I thought, but no chance to get it working. When I simply open a Python console at the same location from where I run the agent and do `import flows`, it works though…
k
Could you show me the Dockerfile?
p
Yes, but now I am confused as to why the Dockerfile would matter if the `Module` storage only matters for the agent?
```dockerfile
FROM python:3.7-slim-buster
WORKDIR /app
COPY . .
RUN pip3 install pipenv
RUN pipenv lock --keep-outdated --requirements > requirements.txt
# libpq-dev and gcc are needed to build C extensions (e.g. psycopg2)
RUN apt-get update && apt-get install -y libpq-dev gcc
RUN pip3 install -r requirements.txt
```
k
Because the flow runs in the container when using `DockerRun`, the Docker container needs the module installed. If using `Module` storage, you need a `setup.py` and `pip install -e .` to install the module as a library in the container.
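As a sketch of what k describes: a minimal `setup.py` at the repo root (the project and package names here are assumptions, not p's actual values) would let the Dockerfile add `RUN pip install -e .` so the code is importable anywhere in the container:

```python
# setup.py -- minimal sketch so the flow code can be installed into the
# image with `pip install -e .`; names below are hypothetical
from setuptools import find_packages, setup

if __name__ == "__main__":
    setup(
        name="flows",              # assumed project name
        version="0.1.0",
        packages=find_packages(),  # discovers the packages in the repo
    )
```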
p
Ah ok, slowly getting it. So basically when using `Module` storage in conjunction with the `Docker` agent and run config, you’re telling the agent: spin up that container, and then pull the flow from a module within that container?
k
Yes exactly. Same with `Local` storage: it’s pulling from the local storage relative to the Docker container.
p
But then two things confuse me:
• What’s the purpose of `DockerStorage` then? Isn’t that the exact same thing?
• With this approach, you’d have to rebuild your Docker image every time a flow changes, which I thought was the exact opposite of what `Module` is trying to achieve?
k
🤦‍♂️ My bad. You’re right, it’s pretty much the same thing. If you really need `DockerRun`, I guess it might as well be `Docker` storage. Otherwise I guess you need to host on Git somewhere?
p
(Also I thought that `Local` is incompatible with the `Docker` agent, according to this page)
Yes I have it hosted on Git. My whole issue is just that I don’t want to rely on a git repo for something that just sits next to the code I am running in my CI/CD
I am basically trying to come up with the third paragraph of this page. Thanks for the help anyway, I’ll try to come up with something.
k
`LocalRun` is incompatible with the `Docker` agent, but `Local` storage can be used with it.
p
But the doc says: `Local Storage is the default Storage option for all flows. Flows using local storage are stored as files in the local filesystem. This means they can only be run by a local agent running on the same machine.`
k
Are you running the Flow on the same machine as the one you are developing on?
p
Euhm, to debug yes; but I am now trying to deploy it. I don’t have any Prefect Server running on my infrastructure for now - I am basically in the whole process of figuring out my best options.
I actually reread the `Module` doc and it’s pretty explicit: `you can use Module storage to reference and load them at execution time (provided the module is installed and importable in the execution environment)`
So I guess I just have to make it available in the Docker image and it should all work 🙂
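One way to verify that before registering anything is a quick stdlib-only check run inside the built image (the dotted path `flows.flow1` below is just this thread’s example, and the script name is made up):

```python
# check_import.py -- run inside the container, e.g.
#   docker run <image> python check_import.py
# to confirm that the module Module storage will import is resolvable there.
import importlib.util
import sys

def importable(dotted: str) -> bool:
    """Return True if `dotted` can be resolved on the current sys.path."""
    try:
        return importlib.util.find_spec(dotted) is not None
    except ModuleNotFoundError:
        # raised when a parent package in the dotted path is missing
        return False

print("sys.path:", sys.path)
print("flows.flow1 importable:", importable("flows.flow1"))
```

If this prints `False`, the flow run would fail with the same `ImportError` at execution time.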
k
I see. Sorry for the messy thread; we don’t normally recommend `Local` storage with `DockerRun`, so I think the docs are fine. I was asking because if it’s all on the same box you could use `Local` everything. But yeah, I think you might need to use Docker. Where do you host your Docker images?
p
In Azure, but I guess I now have a clear idea of what my problem is 🙂
One-liner in my `Dockerfile`:
```dockerfile
ENV PYTHONPATH "${PYTHONPATH}:/app/flows"
```
😍
I have it exactly as I need now! All the dependencies and flows are nicely packaged in a single Docker image. I then use `Module` storage to point to the correct module within the `flows` package. Execution is done with the Docker agent and `DockerRun` using that image. It’s nice because everything can live in the same repo, and the only thing my pipeline needs to do is build this simple Docker image and push it to the registry. I am very happy 🙂
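For reference, the registration side of that setup could look roughly like this with the Prefect 0.14-era `Module` storage and `DockerRun` APIs (the image name, module path, and project name are placeholders, not p’s actual values):

```python
from prefect import Flow
from prefect.run_configs import DockerRun
from prefect.storage import Module

with Flow("my-flow") as flow:
    ...  # tasks are defined in flows/flow1.py

# Pull the flow from the `flows` package baked into the image,
flow.storage = Module("flows.flow1")
# and execute it in a container started from that same image.
flow.run_config = DockerRun(image="myregistry.azurecr.io/my-flows:latest")

flow.register(project_name="my-project")
```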
Thanks again for the help
k
Nice! Glad you figured it out. That sounds like the best approach
z
Glad you got it worked out! It's a little tricky to explain mixing storage types with containerized environments 🙂 Feel free to share your setup at https://github.com/PrefectHQ/prefect/discussions/4042
p
Yes, it is difficult indeed and some things still remain a bit obscure to me. For example, I now have a repo structured like this:
```
myproject
- mypackage1
-- mymodule.py
- mypackage2
- flows
-- flow1.py
-- flow2.py
```
I use `Module` storage and a `DockerRun` run config. This was raising an `ImportError` for the `flows` package, so I’ve added it to the `PYTHONPATH` of my Docker image. Now it all works, but I am wondering about one thing: in my flow files I import modules from e.g. `mypackage1` and it simply works. So basically I am wondering how `Module` storage resolves all of this.
• For the flows (`Module("flows.flow1")`) you need to add the package to the `PYTHONPATH`, as if the mechanism was not running inside the `WORKDIR` of the Docker image; otherwise it won’t find the `flows` package (`ImportError`)
• But when actually running flows that rely on importing packages that live in `WORKDIR`, the mechanism seems to resolve them just fine, as if executed from `WORKDIR`
Don’t get me wrong, it all works now. I just found the documentation not very clear on this and I am wondering how it works exactly. I tried reading the source but it was a bit too involved/layered for me to understand within a few minutes.
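For what it’s worth, the resolution p asks about can be sketched with the stdlib alone: `Module` storage essentially does an `importlib.import_module` on the dotted path and grabs the flow object off it, so everything follows the normal `sys.path` rules (which `PYTHONPATH` feeds into). A toy reproduction, where the attribute name `flow` and the on-disk layout are assumptions for illustration:

```python
import importlib
import sys
import tempfile
from pathlib import Path

def load_flow(module_path: str, attr: str = "flow"):
    """Roughly what Module storage does at execution time:
    import the dotted path and return the named attribute."""
    module = importlib.import_module(module_path)  # plain sys.path lookup
    return getattr(module, attr)

# Simulate the container layout: a `flows` package living in a
# directory that is NOT yet on sys.path.
base = Path(tempfile.mkdtemp())
pkg = base / "flows"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "flow1.py").write_text("flow = 'my-flow-object'")

try:
    load_flow("flows.flow1")
except ImportError:
    print("ImportError: 'flows' is not on sys.path yet")

# This is what the `ENV PYTHONPATH ...` line achieves for the process:
sys.path.insert(0, str(base))
print(load_flow("flows.flow1"))  # the `flow` attribute of flows.flow1
```

This also explains the second bullet: once the flow module is imported, its own `import mypackage1` statements go through the same `sys.path` lookup, and the working directory of the process (the image’s `WORKDIR`, in this setup) is typically on that path.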