Sen

    6 months ago
    I have tried the suggestion from @Abhishek by following the GitHub project at https://github.com/kvnkho/demos/tree/main/prefect/docker_with_local_storage, but it still doesn't work as expected. I believe I am missing some configuration.
    I started the Docker agent on my local machine and registered the flow with my Prefect Server. I can see the agent running. As mentioned in the readme, I initiated a run from the Prefect UI and the run got scheduled, but it never reaches the agent. The agent is still waiting for flows.
    Please find my code for the experiment at https://github.com/Navaneethsen/prefect_docker_experiment
    Anna Geller

    6 months ago
    Can you try with Prefect Cloud first? It's free to use and it would be significantly easier to debug and find the issue
    Sen

    6 months ago
    sure.. I will try and let you know.
    Anna Geller

    6 months ago
    Also, one thread above discusses pretty much the same issue 😄 related to difficulties running a Docker agent with Server on Docker Compose. Maybe that thread can help as well. And specifically related to your question: under flows, this repo shows several examples of various storage and run configurations: https://github.com/anna-geller/packaging-prefect-flows It seems that you want local storage (assuming your flow file is copied into the Docker image and you provide a path within the container) and a Docker run config, correct? If so, then try this flow example: https://github.com/anna-geller/packaging-prefect-flows/blob/master/flows/local_script_docker_run_local_image.py
    Sen

    6 months ago
    I was literally going through the last link you sent. 😀
    By the way, in Prefect Cloud the agent was triggered when I tried to do a run from the Prefect UI. But it threw the error shown below:
    Failed to load and execute Flow's environment: ModuleNotFoundError("No module named '/app/workflow/flow'")
    But why is this not getting triggered when I try this in my prefect server?
    The Prefect Server is still on 0.15.13; I haven't updated it because we have some prod pipelines running on it. (I remember you said it is better to update to the latest version. 😕)
    @Anna Geller I have also gone through your Docker-in-Docker post on Discord. But since you are here now, I just want to understand something.
    Assume that I have a Docker-in-Docker setup. For the new Docker containers that get started from the main Docker container running the agent, is their base image the same as the main Docker agent's image?
    I mean, if the job runs in the child containers that get spun up from the main container, the main Docker agent doesn't need any of the libraries specific to the job, right? Which means I need to install all the libraries in the flow's Docker image. Is that right? Or did I confuse you?
    Anna Geller

    6 months ago
    The image will be exactly as you specify it on your DockerRun run configuration. If you don't provide any image explicitly, Prefect will try to infer it from the execution environment, so yes, I think it would pick up the one you set on the agent if you provided one there. If you haven't, then Prefect will check which Python and Prefect versions are installed on the underlying machine and will use the corresponding image, e.g. if you have Python 3.8 and Prefect 0.15.13, then Prefect will pull prefecthq/prefect:0.15.13-python3.8
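To make the naming convention in that last example concrete, here is a minimal sketch of how such a tag could be assembled from the two version numbers. The helper `default_image_tag` is hypothetical and written only for illustration; it is not Prefect's own code, and the real inference logic may differ.

```python
import sys
from typing import Optional, Tuple


def default_image_tag(
    prefect_version: str,
    python_version: Optional[Tuple[int, int]] = None,
) -> str:
    """Build a tag shaped like 'prefecthq/prefect:<prefect>-python<major>.<minor>'.

    Hypothetical helper: it only mirrors the naming pattern from the message
    above and does not call into Prefect.
    """
    if python_version is None:
        # Fall back to the interpreter running this script.
        python_version = sys.version_info[:2]
    major, minor = python_version
    return f"prefecthq/prefect:{prefect_version}-python{major}.{minor}"


# The example from the thread: Python 3.8 with Prefect 0.15.13.
print(default_image_tag("0.15.13", (3, 8)))
```

Setting `image=` explicitly on `DockerRun`, as the flow in this thread does with `test:latest`, sidesteps the inference entirely.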
    Sen

    6 months ago
    ok.. I understand that .. thanks
    So getting back to the previous issue:
    Failed to load and execute Flow's environment: ModuleNotFoundError("No module named '/app/workflow/flow'")
    This is my flow:
    import prefect
    from prefect import Flow, task
    from prefect.run_configs import DockerRun
    from prefect.storage import Local
    
    from components.componentA import ComponentA 
    from components.componentB import ComponentB
    
    @task
    def test_task():
        logger = prefect.context.get("logger")
        x = ComponentA(2)
        y = ComponentB(2)
        x = x.n + y.n
        logger.info(f"Test {x}!")  # Should return 4
        return
    
    with Flow("docker_example", 
              storage=Local(path="/app/workflow/flow.py", stored_as_script=True, add_default_labels=False), 
              run_config=DockerRun(image="test:latest")) as flow:
        test_task()
    
    flow.register("SampleFlows", labels=["TestFlow"])
    Anna Geller

    6 months ago
    Exactly. So Docker-in-Docker is already risky from a storage perspective: Docker doesn't pull only the dependencies that are missing from the parent container. Instead, it duplicates everything, and the layers are not shared between the parent and child images.
    Sen

    6 months ago
    Ok. I figured out the issue with the error happening in cloud with my flow: I forgot to add this to the Dockerfile
    ENV PYTHONPATH="$PYTHONPATH:/app"
    I have updated my repo with the changes needed to get this setup working: https://github.com/Navaneethsen/prefect_docker_experiment Thanks for the pointer and your patience. 🙏
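For anyone reading later, a minimal sketch of why the `PYTHONPATH` line matters: Python only finds a package like `components` if its parent directory is on the module search path. The temporary directory below is an assumption that stands in for `/app` inside the image, and `build_fake_app` and `can_import` are hypothetical helpers; appending to `sys.path` at runtime has the same effect on imports as adding the directory to `PYTHONPATH` before the interpreter starts.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_fake_app(root: Path) -> None:
    """Create a tiny 'components' package under root (stand-in for /app)."""
    pkg = root / "components"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "componentA.py").write_text(
        "class ComponentA:\n"
        "    def __init__(self, n):\n"
        "        self.n = n\n"
    )


def can_import(module_name: str) -> bool:
    """Return True if the module can be imported with the current sys.path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False


app_root = Path(tempfile.mkdtemp())
build_fake_app(app_root)

# The package exists on disk, but its parent directory is not on sys.path yet.
before = can_import("components.componentA")

# Equivalent in effect to ENV PYTHONPATH="$PYTHONPATH:/app" in the Dockerfile.
sys.path.append(str(app_root))
after = can_import("components.componentA")

print(before, after)
```

The first import attempt fails and the second succeeds, even though the files on disk did not change in between.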
    Anna Geller

    6 months ago
    I love that you document and share your work, great job! 🙌