# prefect-community
s
I have tried the suggestion from @Abhishek by following the GitHub project at https://github.com/kvnkho/demos/tree/main/prefect/docker_with_local_storage, but it still doesn't work as expected. I believe I am missing some configuration.
So I started the Docker agent on my local machine and registered the flow with my Prefect Server. I can see the agent running. As mentioned in the README, I initiated a run from the Prefect UI and the run got scheduled, but it never reaches the agent. The agent is still waiting for flows.
Please find my code for the experiment at https://github.com/Navaneethsen/prefect_docker_experiment
a
Can you try with Prefect Cloud first? It's free to use and it would be significantly easier to debug and find the issue
s
sure.. I will try and let you know.
a
Also, one thread above discusses pretty much the same issue 😄 (difficulties running the Docker agent with Server on Docker Compose), so that thread may help as well.
Specifically related to your question: under flows, this repo shows several examples of various storage and run configurations: https://github.com/anna-geller/packaging-prefect-flows
It seems that you want local storage (assuming your flow file is copied into the Docker image and you provide a path within the container) and a Docker run, correct? If so, try this flow example: https://github.com/anna-geller/packaging-prefect-flows/blob/master/flows/local_script_docker_run_local_image.py
s
I was literally going through the last link you sent. 😀
By the way, in Prefect Cloud the agent was triggered when I started a run from the Prefect UI, but it threw the error shown below:
Failed to load and execute Flow's environment: ModuleNotFoundError("No module named '/app/workflow/flow'")
But why does the agent not get triggered when I try this on my Prefect Server?
The Prefect Server is still on 0.15.13; I haven't updated it because we have some prod pipelines running on it. (I remember you said it is better to update to the latest version. 😕)
@Anna Geller I have also gone through your Docker-in-Docker post in discord. But since you are here now, I just want to understand something.
Assume that I have a Docker-in-Docker setup. For the new containers that get started from the main container running the agent, is their base image the same as that of the main Docker agent?
I mean, if the job runs in the child containers that get spun up from the main container, the main Docker agent doesn't need any of the libraries specific to the job, right? Which means I need to install all the libraries in the flow's Docker image. Is that right? Or did I confuse you?
a
The image will be exactly what you specify on your DockerRun run configuration. If you don't provide an image explicitly, Prefect will try to infer it from the execution environment, so yes, I think it would pick up the one you set on the agent if you provided one there. If you haven't, Prefect will check which Python and Prefect versions are installed on the underlying machine and use the corresponding image, e.g. if you have Python 3.8 and Prefect 0.15.13, Prefect will pull prefecthq/prefect:0.15.13-python3.8
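To make the fallback naming concrete, here is a minimal sketch that builds the image tag from a Prefect version and a Python version. The helper name `default_prefect_image` is hypothetical and this is not Prefect's own implementation; it only mirrors the tag pattern from the example above.

```python
import sys
from typing import Optional


def default_prefect_image(prefect_version: str, python_version: Optional[str] = None) -> str:
    """Hypothetical helper: build a tag such as prefecthq/prefect:0.15.13-python3.8."""
    if python_version is None:
        # Assumption: fall back to the interpreter running this script (major.minor only).
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return f"prefecthq/prefect:{prefect_version}-python{python_version}"


print(default_prefect_image("0.15.13", "3.8"))  # prefecthq/prefect:0.15.13-python3.8
```

Setting `image=` explicitly on `DockerRun`, as in the flow further down the thread, sidesteps this inference entirely.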
s
ok.. I understand that .. thanks
So getting back to the previous issue:
Failed to load and execute Flow's environment: ModuleNotFoundError("No module named '/app/workflow/flow'")
This is my flow:
import prefect
from prefect import Flow, task
from prefect.run_configs import DockerRun
from prefect.storage import Local

from components.componentA import ComponentA 
from components.componentB import ComponentB

@task
def test_task():
    logger = prefect.context.get("logger")
    x = ComponentA(2)
    y = ComponentB(2)
    x = x.n + y.n
    logger.info(f"Test {x}!")  # Should return 4
    return

with Flow("docker_example", 
          storage=Local(path="/app/workflow/flow.py", stored_as_script=True, add_default_labels=False), 
          run_config=DockerRun(image="test:latest")) as flow:
    test_task()

flow.register("SampleFlows", labels=["TestFlow"])
a
Exactly. Docker-in-Docker is already risky from a storage perspective: Docker does not just pull the dependencies that are missing from the parent container. Instead, it duplicates everything, because layers are not shared between the parent and child images.
s
Ok. I figured out the cause of the error in Cloud with my flow. I forgot to add this to the Dockerfile:
ENV PYTHONPATH="$PYTHONPATH:/app"
I have updated my repo with the changes needed to get this setup working: https://github.com/Navaneethsen/prefect_docker_experiment Thanks for the pointer and your patience. 🙏
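For anyone hitting the same import problem, here is a small sketch of why putting `/app` on `PYTHONPATH` matters. It uses a temporary directory as a stand-in for `/app` and a dummy `components` package (both are assumptions for illustration, not the contents of the repo), and shows that the package only becomes importable once its parent directory is on `sys.path`, which is where `PYTHONPATH` entries end up at interpreter startup.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def can_import(module_name: str) -> bool:
    """Return True if module_name is importable with the current sys.path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False


# Stand-in for /app: a temporary directory holding a dummy `components` package.
app_dir = Path(tempfile.mkdtemp())
pkg = app_dir / "components"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "componentA.py").write_text(
    "class ComponentA:\n    def __init__(self, n):\n        self.n = n\n"
)

before = can_import("components.componentA")  # app_dir is not on sys.path yet
sys.path.append(str(app_dir))  # similar in effect to extending PYTHONPATH
after = can_import("components.componentA")

print(before, after)
```

The first check fails and the second succeeds, which matches what the `ENV PYTHONPATH` line does for the flow's `from components...` imports inside the container.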
a
I love that you document and share your work, great job! 🙌