# prefect-community
r
If I want to have an external Python file (e.g. in a src/ directory), what is the best way to import it? I tried following similar logic to this: https://docs.prefect.io/orchestration/flow_config/storage.html#loading-additional-files-with-git-storage but adding to the import path:
import pathlib, sys
file_path = pathlib.Path(__file__).resolve().parent
sys.path.append(file_path)
But keep getting this error:
[23 February 2022 4:22pm]: Failed to load and execute Flow's environment: ModuleNotFoundError("No module named 'src'")
Is there a best practice for importing external python code into a flow?
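A side note on the snippet above: the Python documentation for sys.path says that only strings should be added and that other data types are ignored during import, so appending a pathlib.Path object likely has no effect. Below is a minimal sketch of the same idea with a string entry. The name base_dir is chosen here for illustration, and the fallback to the working directory is an assumption about how the file might be executed. This alone may not resolve the error when the flow is loaded from Git storage.

```python
import pathlib
import sys

# Directory expected to contain the src/ package (here: this file's directory).
# Fall back to the current working directory when __file__ is undefined,
# which can happen when the code is run through exec().
try:
    base_dir = pathlib.Path(__file__).resolve().parent
except NameError:
    base_dir = pathlib.Path.cwd()

# sys.path entries should be strings, so convert the Path explicitly.
base_dir_str = str(base_dir)
if base_dir_str not in sys.path:
    sys.path.append(base_dir_str)
```

With the entry stored as a string, the import system will at least consider that directory when resolving "from src... import ...".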
k
Are you using the Local agent?
You can specify the working dir like the last example here
r
No, I am using Git (preferably GitLab) for storage and ECSRun as my run config. My flow looks like:
# general prefect imports
import prefect
from prefect import task, Flow
from prefect.storage import Git
from prefect.run_configs import ECSRun
from prefect.client import Secret

# specific imports to load files from src/
import pathlib, sys
file_path = pathlib.Path(__file__).resolve().parent
sys.path.append(file_path)

from src.seasonality_index_builder_dynamic_agg import run_seasonality_index_builder_dynamic_agg

# define a wrapper task to expose logging
@task(log_stdout=True, checkpoint=False)
def run_script():
    logger = prefect.context.get("logger")
    logger.info("Running script...")
    run_seasonality_index_builder_dynamic_agg()

# instantiate the flow - we store the flow definition in gitlab
with Flow("seasonality_index_builder",
        storage=Git(
            [git info]
            ),
        run_config=ECSRun(
            [ECS stuff]
            )
         ) as flow:
    run_script()

# Register the flow under the "Testing" project
flow.register(project_name="Testing",
        labels=['ds']
        )
k
Ah yeah, in this case the code really needs to go into the container image used by ECSRun. Git storage is not intended to handle other Python files, just things like SQL and YAML files. The path manipulation is pretty hard and might be impossible. Of course, if you find a solution, please share so we can archive it.
r
Okay - is something like this pretty typical then:
COPY src /home/mambauser/src
in the dockerfile, then
import pathlib, sys, os
sys.path.append(str(pathlib.Path(os.environ["HOME"]).resolve()))
in the flow. In this case os.environ["HOME"] should resolve to /home/mambauser
k
Not really, because at this point you may as well install that src as a Python package so it's accessible wherever the Flow runs. Are you familiar with how to do that?
r
As in break out the contents of src into a module, then install via pip within my Docker container?
k
Yes, but I don't know what you mean by "break out". I think you just need to provide a setup.py?
If ever it helps you, here is a blog for that
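For readers of this archive, here is a minimal sketch of what such a setup.py might look like. It assumes the file sits next to the src/ directory and that src/ contains an __init__.py. The distribution name and version are hypothetical placeholders, not taken from the thread.

```python
# setup.py, placed in the directory that contains src/ (assumed layout)
from setuptools import setup

setup(
    name="seasonality-index-builder",  # hypothetical distribution name
    version="0.1.0",                   # hypothetical version
    packages=["src"],                  # assumes src/ has an __init__.py
)
```

Under these assumptions, running "pip install ." in that directory while building the image should make "from src.seasonality_index_builder_dynamic_agg import ..." resolvable without any sys.path changes in the flow file. Naming the package something more specific than src would reduce the chance of clashing with other projects.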
r
Poor phrasing on my part - that blog post will be a good starting point. Thanks a ton for the quick response and feedback. Really enjoying prefect so far!
k
Of course! 🙂