Thread
#prefect-community
    Mitchell Bregman

    1 year ago
    Hi there, I am running into a very odd issue with module packaging and registering to Prefect Cloud. The code lives here and the process to register lives here. I'm getting a
    ModuleNotFoundError: No module named src
    during the flow healthcheck, traceback here. Am I doing something wrong in terms of
    __init__
    packaging? This is a follow-up to yesterday's thread.
    nicholas

    1 year ago
    It looks like when
    src
    is referenced, it's from within the
    src
    directory, shouldn't the
    __init__
    module reference it with
    from flow import Flow
    ?
    (I could be wrong here, that's just my initial thought)
    Mitchell Bregman

    1 year ago
    i can try that! one sec
    Michael Adkins

    1 year ago
    I do not think that will resolve it
    Mitchell Bregman

    1 year ago
    yeah because then my package locally will be messed up
    Michael Adkins

    1 year ago
    From within the
    src
    init file you should still reference the full path to the module
    Mitchell Bregman

    1 year ago
    I am as such:
    """Top-level module."""
    from src.flow import flow
    
    __all__ = ["flow"]
    u think its a naming issue? i can change
    flow.py
    to
    build.py
    or something
    Michael Adkins

    1 year ago
    Did you see this warning?
    /opt/prefect/healthcheck.py:147: UserWarning: Flow uses module which is not importable. Refer to documentation on how to import custom modules <https://docs.prefect.io/api/latest/environments/storage.html#docker>
      flows = cloudpickle_deserialization_check(flow_file_paths)
    The module that’s not importable is probably
    src
    which is not installed within the docker container
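For context, here is a minimal sketch of why a deserialization check can fail this way. It uses the standard library's pickle rather than cloudpickle, and a throwaway module name (demo_src_pkg, hypothetical) standing in for src. The assumption is that functions from an importable module are stored as a reference to that module, so loading them needs the module to be importable wherever the load happens:

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path


def demo_unpickle_without_module() -> str:
    """Pickle a function from a throwaway module, then load it after the module is gone."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for the `src` package discussed in this thread.
        Path(tmp, "demo_src_pkg.py").write_text("def task():\n    return 'ok'\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("demo_src_pkg")
            # The payload holds a reference ("demo_src_pkg.task"), not the function body.
            payload = pickle.dumps(module.task)
        finally:
            # Simulate an environment where the package was never installed.
            sys.path.remove(tmp)
            sys.modules.pop("demo_src_pkg", None)
    importlib.invalidate_caches()
    try:
        pickle.loads(payload)
    except ModuleNotFoundError as exc:
        return f"ModuleNotFoundError: {exc}"
    return "loaded"


print(demo_unpickle_without_module())
```

The local machine has the package installed, so the same load works there. The image only gets what is explicitly copied in and installed.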
    Mitchell Bregman

    1 year ago
    it is installed via pip install -e . … when I locally
    import src
    all works just fine
    which is the same process I am following in the CI workflow
    Michael Adkins

    1 year ago
    pip install -e
    is not run within the docker container though
    Which is being used to store your flow
    Mitchell Bregman

    1 year ago
    got it… so what kind of workaround is there?
    i can include an additional step in the docker storage?
    nicholas

    1 year ago
    Oh, couldn't you copy the
    src
    folder to the docker container?
    Michael Adkins

    1 year ago
    You can probably install your module using the
    extra_dockerfile_commands
    kwarg or include your module like so
    Docker(
        files={
            # absolute path source -> destination in image
            "/Users/me/code/mod1.py": "/modules/mod1.py",
            "/Users/me/code/mod2.py": "/modules/mod2.py",
        },
        env_vars={
            # append modules directory to PYTHONPATH
            "PYTHONPATH": "$PYTHONPATH:modules/"
        },
    )
    @nicholas it’ll need to be copied in and then installed or added to the python path
    @Mitchell Bregman there’s an example in the docker storage docs linked from that warning I pasted in
    Python package management is a bit of a headache 😕
    We have plans to write a blog post about it someday 🙂
    Mitchell Bregman

    1 year ago
    I'm a bit confused about what you suggested
    so i should copy each file over?
    Michael Adkins

    1 year ago
    So in the code block I pasted you are listing files that you’d like to pass into the docker image. You can actually just list the directory so
    "/path/in/ci/to/module": "/modules"
    Mitchell Bregman

    1 year ago
    got it - 1 sec
    Michael Adkins

    1 year ago
    That will copy your module directory into the image. Then you need to either install it by running
    pip install -e /modules/yourmodule
    (via the extra cmds) or add it to the PYTHONPATH using
    env_vars
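A rough sketch of the two options side by side. The helper name local_package_kwargs and its parameters are hypothetical, not part of Prefect. It only builds the keyword arguments (files, extra_dockerfile_commands, env_vars) mentioned in this thread, with the extra commands written as a list of Dockerfile instructions:

```python
from typing import Any, Dict


def local_package_kwargs(
    src_dir: str, dest_dir: str = "/modules", install: bool = True
) -> Dict[str, Any]:
    """Build keyword arguments for Docker storage that ship a local package in the image.

    Hypothetical helper: only the kwarg names come from the thread.
    """
    # Copy the local directory into the image at dest_dir.
    kwargs: Dict[str, Any] = {"files": {src_dir: dest_dir}}
    if install:
        # Option 1: install the copied package while the image is built.
        kwargs["extra_dockerfile_commands"] = [f"RUN pip install -e {dest_dir}"]
    else:
        # Option 2: skip installation and put the copied directory on PYTHONPATH.
        # dest_dir has to be the directory that contains the package folder.
        kwargs["env_vars"] = {"PYTHONPATH": f"$PYTHONPATH:{dest_dir}"}
    return kwargs


print(local_package_kwargs("/path/in/ci/to/module"))
print(local_package_kwargs("/path/in/ci/to/module", install=False))
```

The result would then be passed along as Docker(**local_package_kwargs("/path/in/ci/to/module")), merged with whatever other arguments the storage already takes.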
    Mitchell Bregman

    1 year ago
    didnt seem to like this
    i think im doing something wrong
    flow.storage = Docker(
        env_vars=config.ENVIRONMENT_VARIABLES,
        extra_dockerfile_commands="pip install -e /modules",
        files={f"{os.path.join(os.path.expanduser('~'), 'project')}": "/modules"},
        image_name=config.DOCKER_IMAGE_NAME,
        image_tag=config.DOCKER_IMAGE_TAG,
        python_dependencies=config.PYTHON_DEPENDENCIES,
        registry_url="parkmobile-docker.jfrog.io",
        tls_config=tls_config,
    )
    this
    os.path.join(os.path.expanduser('~'), 'project')
    resolves to
    /home/circleci/project
    (all the code if you were to clone it lives here)
    so im moving this to
    /modules
    Michael Adkins

    1 year ago
    Seems reasonable, what was the error?
    Mitchell Bregman

    1 year ago
    docker.errors.APIError: 400 Client Error: Bad Request ("Dockerfile parse error line 16: unknown instruction: P")
    one sec - sending u full traceback
    Michael Adkins

    1 year ago
    extra commands expects a list of strings
    Mitchell Bregman

    1 year ago
    ahhhh
    testing it out!
    Michael Adkins

    1 year ago
    And it’s probably got to be in docker format so
    ["RUN …"]
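A small sketch of why a plain string could produce "unknown instruction: P". It assumes the extra commands are joined one per line when the Dockerfile is written. That is a guess about Prefect's internals, not something confirmed in this thread, and render_extra_commands is a hypothetical stand-in for that step:

```python
from typing import Iterable


def render_extra_commands(commands: Iterable[str]) -> str:
    """Join entries one per line, the way a Dockerfile builder presumably would."""
    return "\n".join(commands)


# A bare string is iterated character by character, so every character
# ends up on its own Dockerfile line.
as_string = render_extra_commands("pip install -e /modules")

# A list of full Dockerfile instructions renders as intended.
as_list = render_extra_commands(["RUN pip install -e /modules"])

print(repr(as_string.splitlines()[0]))  # the first "instruction" is a lone 'p'
print(as_list)
```

Under that assumption the first extra line in the Dockerfile is just "p", which would match the parse error above, and wrapping the command in a list with a RUN prefix avoids it.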
    Mitchell Bregman

    1 year ago
    roger that
    im thinking this might be the one!! standby
    thanks for all ur help!!
    Michael Adkins

    1 year ago
    No problem! It’ll all be worth it for the write up in the end 😉
    Miha Sajko

    1 year ago
    Has there been any documentation written on this issue? Or perhaps a better question, is there any alternative implementation in the roadmap to solve this more elegantly?
    My flows predominantly consist of custom made tasks which themselves can be quite complex (relying on various custom functions, classes, etc). Do I correctly understand that if I want to use Docker storage I have to use the
    files
    argument as discussed in this thread or is there a better way?
    Michael Adkins

    1 year ago
    I’m working on a guide to this and some more examples — as we get a feel for how people are using it we can introduce easier to use functionality directly in prefect.
    Currently, I do something like this:
    from my_project import PROJECT_PATH, PROJECT_NAME
    from prefect.storage.docker import Docker
    
    
    def ProjectDockerStorage(
        project_path: str = PROJECT_PATH, project_name: str = PROJECT_NAME, **kwargs
    ) -> Docker:
        """
        A thin wrapper around `prefect.storage.Docker` with installation of a local project,
        defaulting to installing this project
    
        Cannot be a class because then it is not a known serializable storage type so this
        is just an instance factory for Docker storage
        """
    
        # Copy this namespace into the docker image
        kwargs.setdefault("files", {})
        kwargs["files"][str(project_path)] = project_name
    
        # Install the namespace so it's on the Python path
        kwargs.setdefault("extra_dockerfile_commands", [])
        kwargs["extra_dockerfile_commands"].append(f"RUN pip install -e {project_name}")
    
        return Docker(**kwargs)
    then
    flow.storage = ProjectDockerStorage()
    Sagun Garg

    1 year ago
    @Michael Adkins Could you please share this code example in your GitHub repo? I am facing similar issues