Jillian Kozyra

    1 year ago
    Hey, i’m getting `ModuleNotFoundError: No module named 'data_flows'` when I try to register my flow locally using docker storage (specifically it’s failing the `cloudpickle_deserialization_check` healthcheck). we’re using fully qualified imports in our flows (e.g. `from data_flows.utils.data_extraction_tasks import thing`) to satisfy mypy, so i’m hoping there’s a way to get this working as well. i think the config matches the tutorial, and i’ve tried different values for the python path without any luck. Here’s our Docker setup:
    Docker(
        registry_url=ecr_registry_url,
        python_dependencies=python_dependencies,
        files={
            "/Users/jilliankozyra/projects/projectname/data_flows/utils/__init__.py": "/data_flows/utils/__init__.py",
            "/Users/jilliankozyra/projects/projectname/data_flows/utils/install_dependencies.py": "/data_flows/utils/install_dependencies.py",
            "/Users/jilliankozyra/projects/projectname/data_flows/utils/data_extraction_tasks.py": "/data_flows/utils/data_extraction_tasks.py"
        },
        env_vars={
            "PYTHONPATH": "$PYTHONPATH:data_flows/"
        },
    )
    Michael Adkins

    1 year ago
    This likely isn't working because there's no `__init__.py` being copied in for the top-level `data_flows` folder.
    Have you considered just using `pip install` instead? It tends to be easier than setting a PYTHONPATH variable.
    Jillian Kozyra

    1 year ago
    these are not pip-installable dependencies. i’ve explicitly added a `/data_flows/__init__.py` but it does not appear to have made a difference.
    Michael Adkins

    1 year ago
    You can just add a `setup.py` file, then pip install the local module.
    Also note you can copy a whole directory with the `files` dict instead of listing each file.
    Have you tested the import locally instead of inside your docker image? It'll be faster to iterate that way.
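
The directory-copy suggestion above can be sketched as follows. This is only an illustration: `PROJECT_ROOT` and `directory_mapping` are hypothetical names, and it assumes (per the comment above) that the storage's `files` dict accepts a directory as a source and copies it recursively.

```python
from pathlib import PurePosixPath

# Hypothetical local checkout, taken from the paths in the original question.
PROJECT_ROOT = PurePosixPath("/Users/jilliankozyra/projects/projectname")


def directory_mapping(package: str) -> dict:
    """Map one local package directory to the same-named directory at the image root."""
    return {str(PROJECT_ROOT / package): f"/{package}"}


# A single entry replaces the three per-file entries in the original config.
files = directory_mapping("data_flows")
```

The resulting dict would be passed as `files=files` in the `Docker(...)` call shown earlier in the thread.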
    Jillian Kozyra

    1 year ago
    yes, it works fine locally, it just fails in the docker build
    i’m only explicitly copying files for testing purposes - the original code that handles this (not written by me) copies everything in every directory, which makes iterating slow
    Michael Adkins

    1 year ago
    What's your local PYTHONPATH look like?
    Jillian Kozyra

    1 year ago
    export PYTHONPATH=.:${PYTHONPATH}:/Users/jilliankozyra/projects/projectname/
    `data_flows` is a subdirectory of that
    Michael Adkins

    1 year ago
    Ah. Have you tried adding just `/` to your PYTHONPATH in the docker image?
    Jillian Kozyra

    1 year ago
    i think that’s fixed it! thanks!
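
Why `/` works where `data_flows/` did not: for `from data_flows.utils... import thing` to resolve, the search path has to contain the directory that *holds* `data_flows`, not `data_flows` itself. Since the files were copied to `/data_flows/...`, that parent is `/`. The sketch below reproduces this outside Docker, using a temporary directory as a stand-in for the image root; `can_import` is a hypothetical helper and the module contents are made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def can_import(module_name: str, path_entry: str) -> bool:
    """Return True if module_name imports with path_entry prepended to sys.path."""
    top_level = module_name.split(".")[0]
    # Drop cached copies so each attempt is resolved from scratch.
    for name in [n for n in sys.modules if n == top_level or n.startswith(top_level + ".")]:
        del sys.modules[name]
    importlib.invalidate_caches()
    sys.path.insert(0, path_entry)
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False
    finally:
        sys.path.remove(path_entry)


with tempfile.TemporaryDirectory() as root:
    # Recreate the layout from the thread: <root>/data_flows/utils/data_extraction_tasks.py
    utils = Path(root) / "data_flows" / "utils"
    utils.mkdir(parents=True)
    (Path(root) / "data_flows" / "__init__.py").write_text("")
    (utils / "__init__.py").write_text("")
    (utils / "data_extraction_tasks.py").write_text("thing = 'ok'\n")

    target = "data_flows.utils.data_extraction_tasks"
    # Path entry pointing at the package itself, like PYTHONPATH=...:data_flows/
    inside_package = can_import(target, str(Path(root) / "data_flows"))
    # Path entry pointing at the package's parent, like PYTHONPATH=/
    parent_of_package = can_import(target, root)

print("path entry = the package directory:", inside_package)
print("path entry = the package's parent: ", parent_of_package)
```

The same reasoning explains the local setup in the thread: the local PYTHONPATH ends in `.../projectname/`, which is the parent of `data_flows`.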