John Mizerany

09/07/2022, 10:03 PM
We are trying to find a way to import custom modules alongside our flows. I found this article to see if there was a way to do this, and tried using sys.path.append to include the module in our PYTHONPATH, but that did not work. We are using Git Remote storage, but it seems the agent we are using is not able to pick up the custom files/modules we wrote in the subdirectory. (We are still using Prefect Cloud 1.0, and when we create a run the UI gives us "Failed to load and execute flow run: ModuleNotFoundError".)

Nate

09/07/2022, 10:16 PM
@John Mizerany what kind of executor (i.e. flow runtime infrastructure) are you intending to use?

John Mizerany

09/07/2022, 10:27 PM
Hey @Nate, we currently have an agent running on an EC2 instance.

Nate

09/07/2022, 10:49 PM
This error seems like it'd be because it's pulling your flow script at flow run submission time from your GitHub storage, but the modules it references aren't available to the EC2 instance running your agent, since only the file specified in the GitHub storage object is pulled. You could include your modules in the flow script if they're small enough; otherwise I'd suggest running Docker on your EC2 instance and building your modules into an image that you can reference in a DockerRun run config.

John Mizerany

09/07/2022, 11:07 PM
Thanks! We have been using the GitHub integration but for this case we switched to Git instead because we saw you are able to pull in other files besides the flow file. Is that still possible with this?

Nate

09/07/2022, 11:15 PM
ahh yes I haven't tried this personally but that should work with Git storage

John Mizerany

09/07/2022, 11:18 PM
Okay, gotcha. That is what I currently have implemented, but I am still getting the ModuleNotFoundError, so I am not sure what could be going wrong if Git storage allows multiple files to be loaded.

Nate

09/07/2022, 11:27 PM
Without looking at the flow code itself I'm not sure I could guess, could you show how you're altering the system path / referring to your modules in your code?

John Mizerany

09/07/2022, 11:33 PM
Essentially here is where I am importing my modules in the flow file:
import sys
sys.path.append("./helpers")
import helpers.aws as aws
import helpers.helpers as helpers
import helpers.snowflake_helpers as snowflake_helpers
And the folder structure looks like this
folder/
|_ flow_example.py
|_ helpers/
    |_ __init__.py
    |_ helpers.py
    |_ aws.py
    |_ snowflake_helpers.py
So it seems that it is messing up on that first helpers.aws line where I am importing
Hey @Nate! I wanted to see if there was any update on whether this setup is possible with the Git storage integration.

Nate

09/08/2022, 3:47 PM
Hi @John Mizerany, I don't think that sys.path.append("./helpers") will work, since this code will run at registration time and it seems you'd be appending your local path to your EC2's sys path. I would try something more like the example you linked, where you get file_path = Path(__file__).resolve().parent, which will update according to where it was loaded from, and then you should be* able to add files from that file_path to your system path. (* I'm not positive whether this works. I'm currently on vacation, so I would have to try to reproduce it myself once I'm back. Feel free to send a message from this thread to the channel if you're still stuck. Again, you can always install your helpers where you run the agent, or use Docker to include your deps via a Dockerfile for a DockerRun.)
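
One way to read the suggestion above, as a minimal sketch rather than a confirmed fix: a relative entry such as "./helpers" is resolved against the process's working directory, so it can point somewhere different on the agent than on the machine that registered the flow. Anchoring the path to the flow file avoids that. Note also that "import helpers.aws" needs the directory that contains helpers/ on sys.path, not helpers/ itself. The name add_flow_dir_to_sys_path is hypothetical, and whether __file__ is defined when Prefect 1.0 loads a flow from Git storage is an assumption that is not verified here.

```python
import sys
from pathlib import Path


def add_flow_dir_to_sys_path(flow_file: str) -> Path:
    """Put the directory containing the flow file on sys.path (hypothetical helper)."""
    flow_dir = Path(flow_file).resolve().parent
    # Add the parent of helpers/, not helpers/ itself, so "import helpers.aws" resolves.
    if str(flow_dir) not in sys.path:
        sys.path.insert(0, str(flow_dir))
    return flow_dir


# In flow_example.py this would be called before the helper imports
# (assumes __file__ is available in that context):
#     add_flow_dir_to_sys_path(__file__)
#     import helpers.aws as aws
```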

John Mizerany

09/08/2022, 3:48 PM
Okay thank you! I can try this out. Let me know if you can replicate this and enjoy your vacation!
๐Ÿ‘ 1
Okay, good news: I figured it out partially. Since the EC2 agent is looking for the module on the machine itself, the import would always break. So I am using importlib to load the module from its file path, and that worked 🎉
I do have some JSON files that are still giving me trouble, but I think I can figure that out since it's the same logic!
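
The importlib code itself is not shown in the thread. A minimal sketch of what such an approach might look like, using only the standard library: load_module_from_path and load_json_beside are hypothetical names, and base_dir stands for wherever the repository's files end up on the agent, which the thread does not specify.

```python
import importlib.util
import json
import sys
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: Path) -> ModuleType:
    """Import a module directly from a file path, bypassing sys.path lookup."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register before executing the module body
    spec.loader.exec_module(module)
    return module


def load_json_beside(base_dir: Path, relative_name: str):
    """Read a JSON file relative to a known directory, not the working directory."""
    with open(base_dir / relative_name, encoding="utf-8") as handle:
        return json.load(handle)
```

With the folder layout from earlier in the thread, usage might look like aws = load_module_from_path("aws", base_dir / "helpers" / "aws.py"). A module loaded this way is treated as a top-level module, so any package-relative imports inside it would still fail unless the package itself is importable.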