Mitchell Bregman

10/23/2020, 6:09 PM
Hi there, I am running into a very odd issue with regard to module packaging and registering to Prefect Cloud. The code lives here and the process to register lives here. Getting a `ModuleNotFoundError: No module named src` during the flow healthcheck, traceback here. Am I doing something wrong in terms of `__init__` packaging? This is a follow-up to yesterday's thread.

nicholas

10/23/2020, 6:13 PM
It looks like when `src` is referenced, it's from within the `src` directory, shouldn't the `__init__` module reference it with `from flow import Flow`?
( i could be wrong here, that's just my initial thought)

Mitchell Bregman

10/23/2020, 6:14 PM
i can try that! one sec
👍 1

Zanie

10/23/2020, 6:15 PM
I do not think that will resolve it

Mitchell Bregman

10/23/2020, 6:15 PM
yeah because then my package locally will be messed up

Zanie

10/23/2020, 6:16 PM
From within the `src` init file you should still reference the full path to the module

Mitchell Bregman

10/23/2020, 6:16 PM
I am as such:
"""Top-level module."""
from src.flow import flow

__all__ = ["flow"]
you think it's a naming issue? I can change `flow.py` to `build.py` or something

Zanie

10/23/2020, 6:17 PM
Did you see this warning?
/opt/prefect/healthcheck.py:147: UserWarning: Flow uses module which is not importable. Refer to documentation on how to import custom modules <https://docs.prefect.io/api/latest/environments/storage.html#docker>
  flows = cloudpickle_deserialization_check(flow_file_paths)
The module that’s not importable is probably `src`, which is not installed within the docker container
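For context on the warning above: the flow is serialized with cloudpickle, which typically records functions from importable modules by reference, so whatever environment loads the flow must be able to import those modules too. A minimal sketch of an importability check using only the standard library; the helper name `is_importable` is made up for illustration and is not part of Prefect:

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located on the current Python path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised for dotted names with a missing parent, or modules with no spec
        return False


# A standard-library module is found; a name that is not installed is not
print(is_importable("json"))
print(is_importable("surely_not_an_installed_module_xyz"))
```

Running a check like this inside the built image, rather than on the host or CI machine, should show what the healthcheck sees.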

Mitchell Bregman

10/23/2020, 6:18 PM
it is installed via `pip install -e .`… when I locally `import src` all works just fine
which is the same process I am following in the CI workflow

Zanie

10/23/2020, 6:18 PM
`pip install -e` is not run within the docker container though
Which is being used to store your flow

Mitchell Bregman

10/23/2020, 6:19 PM
got it… so what kind of workaround is there?
i can include an additional step in the docker storage?

nicholas

10/23/2020, 6:20 PM
Oh, couldn't you copy the `src` folder to the docker container?

Zanie

10/23/2020, 6:20 PM
You can probably install your module using the `extra_dockerfile_commands` kwarg or include your module like so
Docker(
    files={
        # absolute path source -> destination in image
        "/Users/me/code/mod1.py": "/modules/mod1.py",
        "/Users/me/code/mod2.py": "/modules/mod2.py",
    },
    env_vars={
        # append modules directory to PYTHONPATH
        "PYTHONPATH": "$PYTHONPATH:modules/"
    },
)
@nicholas it’ll need to be copied in and then installed or added to the python path
👍 1
@Mitchell Bregman there’s an example in the docker storage docs linked from that warning I pasted in
upvote 1
Python package management is a bit of a headache 😕
upvote 1
We have plans to write a blog post about it someday 🙂
👍 1

Mitchell Bregman

10/23/2020, 6:22 PM
I'm a bit confused about what you suggested
so i should copy each file over?

Zanie

10/23/2020, 6:23 PM
So in the code block I pasted you are listing files that you’d like to pass into the docker image. You can actually just list the directory, so `"/path/in/ci/to/module": "/modules"`

Mitchell Bregman

10/23/2020, 6:24 PM
got it - 1 sec

Zanie

10/23/2020, 6:24 PM
That will copy your module directory into the image. Then you need to either install it by running `pip install -e /modules/yourmodule` (via the extra cmds) or add it to the PYTHONPATH using `env_vars`
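The two steps described here (copy the directory in, then install it) could be bundled roughly as follows. This is a sketch, not Prefect API: `local_package_storage_kwargs` is a hypothetical helper that only builds the `files` and `extra_dockerfile_commands` keyword arguments named in this thread, with the install command written as a Dockerfile `RUN` instruction inside a list.

```python
from pathlib import Path


def local_package_storage_kwargs(project_dir, image_dir="/modules"):
    """Build Docker storage kwargs that copy a local project into the image and install it.

    `project_dir` is the project root on the build machine (the directory holding
    setup.py); `image_dir` is where it should land inside the image.
    """
    # The source side of `files` is given as an absolute path
    source = str(Path(project_dir).expanduser().resolve())
    return {
        # source on the build machine -> destination in the image
        "files": {source: image_dir},
        # a list of Dockerfile instructions, one string per instruction
        "extra_dockerfile_commands": [f"RUN pip install -e {image_dir}"],
    }


kwargs = local_package_storage_kwargs("~/project")
print(kwargs["extra_dockerfile_commands"])
```

The result would then be passed along with the other storage arguments, e.g. `Docker(image_name=config.DOCKER_IMAGE_NAME, **kwargs)`.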

Mitchell Bregman

10/23/2020, 6:38 PM
didnt seem to like this
i think im doing something wrong
flow.storage = Docker(
    env_vars=config.ENVIRONMENT_VARIABLES,
    extra_dockerfile_commands="pip install -e /modules",
    files={f"{os.path.join(os.path.expanduser('~'), 'project')}": "/modules"},
    image_name=config.DOCKER_IMAGE_NAME,
    image_tag=config.DOCKER_IMAGE_TAG,
    python_dependencies=config.PYTHON_DEPENDENCIES,
    registry_url="parkmobile-docker.jfrog.io",
    tls_config=tls_config,
)
this `os.path.join(os.path.expanduser('~'), 'project')` resolves to `/home/circleci/project` (all the code if you were to clone it lives here)
so I'm moving this to `/modules`

Zanie

10/23/2020, 6:40 PM
Seems reasonable, what was the error?

Mitchell Bregman

10/23/2020, 6:40 PM
docker.errors.APIError: 400 Client Error: Bad Request ("Dockerfile parse error line 16: unknown instruction: P")
one sec - sending u full traceback

Zanie

10/23/2020, 6:41 PM
extra commands expects a list of strings

Mitchell Bregman

10/23/2020, 6:41 PM
ahhhh
testing it out!

Zanie

10/23/2020, 6:41 PM
And it’s probably got to be in docker format so `["RUN …"]`
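The `unknown instruction: P` error is consistent with the string being iterated character by character, so that the first extra Dockerfile line is just `p`. A small sketch of that effect, assuming the extra commands are written out one item per line (`render_lines` and `run_commands` are illustrative helpers, not Prefect functions):

```python
def render_lines(commands):
    """Write each item of `commands` on its own line, as a Dockerfile fragment."""
    return "".join(f"{item}\n" for item in commands)


def run_commands(*shell_commands):
    """Wrap shell commands as Dockerfile RUN instructions, returned as a list of strings."""
    return [f"RUN {cmd}" for cmd in shell_commands]


# A bare string is iterated one character at a time, so every character gets its own line
broken = render_lines("pip install -e /modules")

# A list of RUN instructions keeps each command intact on a single line
fixed = render_lines(run_commands("pip install -e /modules"))

print(broken.splitlines()[:3])
print(fixed)
```

Passing a list such as `run_commands("pip install -e /modules")` for `extra_dockerfile_commands` avoids both problems raised here: the value is a list of strings, and each string is a valid Dockerfile instruction.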

Mitchell Bregman

10/23/2020, 6:41 PM
roger that
im thinking this might be the one!! standby
thanks for all ur help!!

Zanie

10/23/2020, 6:53 PM
No problem! It’ll all be worth it for the write up in the end 😉

Miha Sajko

12/19/2020, 4:56 PM
Has there been any documentation written on this issue? Or perhaps a better question, is there any alternative implementation in the roadmap to solve this more elegantly?
My flows predominantly consist of custom made tasks which themselves can be quite complex (relying on various custom functions, classes, etc). Do I correctly understand that if I want to use Docker storage I have to use the `files` argument as discussed in this thread or is there a better way?

Zanie

12/19/2020, 6:14 PM
I’m working on a guide to this and some more examples — as we get a feel for how people are using it we can introduce easier to use functionality directly in prefect.
Currently, I do something like this:
from my_project import PROJECT_PATH, PROJECT_NAME
from prefect.storage.docker import Docker


def ProjectDockerStorage(
    project_path: str = PROJECT_PATH, project_name: str = PROJECT_NAME, **kwargs
) -> Docker:
    """
    A thin wrapper around `prefect.storage.Docker` with installation of a local project,
    defaulting to installing this project

    Cannot be a class because then it is not a known serializable storage type so this
    is just an instance factory for Docker storage
    """

    # Copy this namespace into the docker image
    kwargs.setdefault("files", {})
    kwargs["files"][str(project_path)] = project_name

    # Install the namespace so it's on the Python path
    kwargs.setdefault("extra_dockerfile_commands", [])
    kwargs["extra_dockerfile_commands"].append(f"RUN pip install -e {project_name}")

    return Docker(**kwargs)
then
flow.storage = ProjectDockerStorage()
🙌 1

Sagun Garg

12/28/2020, 8:21 AM
@Zanie Please can you share this code example in your github repo, I am facing similar issues