# prefect-community
w
hi. I recently came across prefect and it looks amazing. I am trying it out with a few of our jobs, but I'm stuck on something! (Sorry for the simple question!) When registering a flow with Bitbucket storage, how do I ensure that shared functions (from a project utilities module) are available to the flow when it runs on an agent (in this case a local agent)? In the project folder flows are in a flows folder, utilities in another folder, etc. I guess the flow is not being run from within the repo project directory because it can't find the utils and libs modules?
k
Hi @Walter Cavinaw, no worries on the question! All the modules need to be installed in the execution environment. Bitbucket Storage only keeps the flow definition (the flow file). Only Docker storage keeps all the dependencies together, so the recommendation is to install your project code as a Python module inside a Docker container. You can see this blog post if it helps with building that container image.
w
thanks kevin. Your posts and video demos have been super helpful so far 👍
k
Oh thank you for watching 🙂
w
I read through your writeup and tried it out. That works well for me. Thank you for sharing! I have a follow-up question. I get why Bitbucket storage doesn't work (it doesn't have the project modules installed), but does it still clone the whole repo? E.g., you have another file (config, yaml, csv, etc.) in the project and want to read it in the flow. Could you still reference that file?
k
You can use Git storage with additional files like this, but it is meant for non-Python files such as csv, yaml, or sql. Python files need to be installed or added to the Python path.
w
Ok i see, git storage works differently than bitbucket storage (which is using the api to get a single file I presume). Using git storage, I guess if I wanted to do something real hacky, I could add the project directory at the start of each flow file?
sys.path.append(str(Path(__file__).resolve().parent.parent))
Obviously this is not very robust, but I'll try it for this one case...
k
I don’t think it’ll work? We’ve had people try but I have yet to see anyone figure it out. The Python path manipulation is pretty hard
w
yes i see, that makes sense. We don't have cross project dependencies. The local agent is run on an image with all other dependencies (pip/conda) except for project modules. Using GitStorage and adding that line above to each flow file seems to do the trick for us. It adds our utils/libs modules and seems to work. (knock on wood). Just letting you know in case others might find it helpful as a quick and dirty hack. A simple view on our project structure:
project
  /utils
    db_helpers.py
    data_helpers.py
  /flows
    model_flow_x.py
    model_flow_y.py
k
Oh I see. Nice work!
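For anyone trying the same hack, here is a minimal sketch of the path line from the thread wrapped in a small helper. The function name `add_project_root` and the `levels_up` parameter are hypothetical, not part of Prefect. It assumes the layout shown above (flow files one folder below the project root) and that `__file__` is defined when the flow file is loaded, which is what the thread reports with Git storage. It is still a workaround; installing the project as a package remains the recommended route.

```python
import sys
from pathlib import Path


def add_project_root(flow_file: str, levels_up: int = 1) -> Path:
    """Append the project root to sys.path so sibling folders like utils/ import.

    flow_file: path of the flow file, normally __file__.
    levels_up: how many directories above the flow's own folder the root sits
               (1 for project/flows/model_flow_x.py). Hypothetical parameter.
    """
    root = Path(flow_file).resolve().parents[levels_up]
    if str(root) not in sys.path:
        # Appending, as in the thread, keeps already-installed packages
        # ahead of project modules if a name clashes.
        sys.path.append(str(root))
    return root


# Intended use at the very top of a flow file such as flows/model_flow_x.py,
# before any project imports:
#
#     add_project_root(__file__)
#     from utils import db_helpers
```

Since the helper cannot itself be imported from the project before the path is set, it would have to be pasted into each flow file (or live in a package that is already installed on the agent).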