Hi there, I am running into a very odd issue with ...
# ask-community
m
Hi there, I am running into a very odd issue with regards to module packaging and registering to prefect cloud. The code lives here and the process to register lives here. Getting an
ModuleNotFoundError: No module named src
during the flow healthcheck, traceback here. Am I doing something wrong in terms of
__init__
packaging? This is a follow-up to yesterday's thread.
n
It looks like when
src
is referenced, it's from within the
src
directory, shouldn't the
__init__
module reference it with
from flow import Flow
?
( i could be wrong here, that's just my initial thought)
m
i can try that! one sec
👍 1
z
I do not think that will resolve it
m
yeah because then my package locally will be messed up
z
From within the
src
init file you should still reference the full path to the module
m
I am as such:
"""Top-level module."""
from src.flow import flow

__all__ = ["flow"]
u think its a naming issue? i can change
flow.py
to
build.py
or something
z
Did you see this warning?
/opt/prefect/healthcheck.py:147: UserWarning: Flow uses module which is not importable. Refer to documentation on how to import custom modules https://docs.prefect.io/api/latest/environments/storage.html#docker
  flows = cloudpickle_deserialization_check(flow_file_paths)
The module that’s not importable is probably
src
which is not installed within the docker container
m
it is installed via `pip install -e .`… when i locally
import src
all works just fine
which is the same process i am following in CI workflow
z
pip install -e
is not run within the docker container though
Which is being used to store your flow
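A quick way to see the difference, as a minimal sketch: a package is only importable when it is installed or its parent directory is on `sys.path`. The package name `demo_src_pkg` is made up for this demonstration and stands in for `src`.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

# Hypothetical package name, used only for this demonstration.
PKG = "demo_src_pkg"

with tempfile.TemporaryDirectory() as root:
    # Lay out root/demo_src_pkg/__init__.py, like a project checkout.
    pkg_dir = Path(root) / PKG
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("flow = 'placeholder'\n")

    # Not installed and not on sys.path: Python cannot find the package.
    found_before = importlib.util.find_spec(PKG) is not None

    # Putting the parent directory on sys.path (roughly what an editable
    # install or a PYTHONPATH entry arranges) makes it importable.
    sys.path.insert(0, root)
    importlib.invalidate_caches()
    found_after = importlib.util.find_spec(PKG) is not None
    sys.path.remove(root)

print(found_before, found_after)
```

Your CI machine is in the second state because of the editable install. The image built for Docker storage is still in the first state.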
m
got it… so what kind of workaround is there?
i can include an additional step in the docker storage?
n
Oh, couldn't you copy the
src
folder to the docker container?
z
You can probably install your module using the
extra_dockerfile_commands
kwarg or include your module like so
Docker(
    files={
        # absolute path source -> destination in image
        "/Users/me/code/mod1.py": "/modules/mod1.py",
        "/Users/me/code/mod2.py": "/modules/mod2.py",
    },
    env_vars={
        # append modules directory to PYTHONPATH
        "PYTHONPATH": "$PYTHONPATH:modules/"
    },
)
@nicholas it’ll need to be copied in and then installed or added to the python path
👍 1
@Mitchell Bregman there’s an example in the docker storage docs linked from that warning I pasted in
upvote 1
Python package management is a bit of a headache 😕
upvote 1
We have plans to write a blog post about it someday 🙂
👍 1
m
im a bit confused about what u suggested
so i should copy each file over?
z
So in the code block I pasted you are listing files that you’d like to pass into the docker image. You can actually just list the directory so
"/path/in/ci/to/module": "/modules"
m
got it - 1 sec
z
Will copy your module directory into the image. Then you need to either install it by running
pip install -e /modules/yourmodule
(via the extra cmds) or add it to the PYTHONPATH using
env_vars
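For the PYTHONPATH route, a rough sketch of the keyword arguments, assuming the module directory is copied to `/modules`. `pythonpath_storage_kwargs` is a hypothetical helper, not part of Prefect, and the source path is the same placeholder as above.

```python
def pythonpath_storage_kwargs(project_dir: str, dest: str = "/modules") -> dict:
    """Build keyword arguments for Docker storage (hypothetical helper)."""
    return {
        # local directory -> destination directory in the image
        "files": {project_dir: dest},
        # make packages under `dest` importable without installing them
        "env_vars": {"PYTHONPATH": f"$PYTHONPATH:{dest}"},
    }


kwargs = pythonpath_storage_kwargs("/path/in/ci/to/module")
# These would then be passed along with your other settings, e.g.
# flow.storage = Docker(**kwargs)
print(kwargs)
```

If you already pass your own `env_vars`, merge the `PYTHONPATH` entry into that dict so it does not get replaced.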
m
didnt seem to like this
i think im doing something wrong
extra_dockerfile_commands="pip install -e /modules",
    files={f"{os.path.join(os.path.expanduser('~'), 'project')}": "/modules"},
flow.storage = Docker(
    env_vars=config.ENVIRONMENT_VARIABLES,
    extra_dockerfile_commands="pip install -e /modules",
    files={f"{os.path.join(os.path.expanduser('~'), 'project')}": "/modules"},
    image_name=config.DOCKER_IMAGE_NAME,
    image_tag=config.DOCKER_IMAGE_TAG,
    python_dependencies=config.PYTHON_DEPENDENCIES,
    registry_url="parkmobile-docker.jfrog.io",
    tls_config=tls_config,
)
this
os.path.join(os.path.expanduser('~'), 'project')
resolves to
/home/circleci/project
(all the code if you were to clone it lives here)
so im moving this to
/modules
z
Seems reasonable, what was the error?
m
docker.errors.APIError: 400 Client Error: Bad Request ("Dockerfile parse error line 16: unknown instruction: P")
one sec - sending u full traceback
z
extra commands expects a list of strings
m
ahhhh
testing it out!
z
And it’s probably got to be in docker format so
["RUN …"]
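That would also explain the "unknown instruction: P" message. A small sketch, assuming each element of `extra_dockerfile_commands` is written on its own Dockerfile line (`render_extra_commands` is a stand-in I made up, not Prefect's actual code):

```python
def render_extra_commands(commands) -> str:
    """Assumed behavior: write each element of `commands` on its own line."""
    return "\n".join(commands)


# A bare string is iterated character by character, so the first
# "instruction" Docker would see is the single letter "p".
as_string = render_extra_commands("pip install -e /modules")

# A list of Dockerfile instructions renders as intended.
as_list = render_extra_commands(["RUN pip install -e /modules"])

print(as_string.splitlines()[0])
print(as_list)
```

So the kwarg would look like `extra_dockerfile_commands=["RUN pip install -e /modules"]`.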
m
roger that
im thinking this might be the one!! standby
thanks for all ur help!!
z
No problem! It’ll all be worth it for the write up in the end 😉
m
Has there been any documentation written on this issue? Or perhaps a better question, is there any alternative implementation in the roadmap to solve this more elegantly?
My flows predominantly consist of custom made tasks which themselves can be quite complex (relying on various custom functions, classes, etc). Do I correctly understand that if I want to use Docker storage I have to use the
files
argument as discussed in this thread or is there a better way?
z
I’m working on a guide to this and some more examples — as we get a feel for how people are using it we can introduce easier to use functionality directly in prefect.
Currently, I do something like this:
from my_project import PROJECT_PATH, PROJECT_NAME
from prefect.storage.docker import Docker


def ProjectDockerStorage(
    project_path: str = PROJECT_PATH, project_name: str = PROJECT_NAME, **kwargs
) -> Docker:
    """
    A thin wrapper around `prefect.storage.Docker` with installation of a local project,
    defaulting to installing this project

    Cannot be a class because then it is not a known serializable storage type so this
    is just an instance factory for Docker storage
    """

    # Copy this namespace into the docker image
    kwargs.setdefault("files", {})
    kwargs["files"][str(project_path)] = project_name

    # Install the namespace so it's on the Python path
    kwargs.setdefault("extra_dockerfile_commands", [])
    kwargs["extra_dockerfile_commands"].append(f"RUN pip install -e {project_name}")

    return Docker(**kwargs)
then
flow.storage = ProjectDockerStorage()
🙌 1
s
@Zanie Please can you share this code example in your github repo, I am facing similar issues