< Marvin> I’m trying to run a docker container from a prefec Prefect Community #ask-marvin

Stephen Robinson

10/05/2023, 2:55 PM

@Marvin I’m trying to run a docker container from a prefect flow. The prefect 1.0 solution of using DockerRun seemed to be ideal, so I’m looking for a replacement … however DockerContainer has issues connecting to my local docker server … Here’s my (derived from @Nate’s example - thanks!) code …

from prefect.infrastructure.container import DockerContainer
DockerContainer(
    image="my_image",
    command=["python", "/app/modules/my_image/scripts/run_test_script.py"]
).run()

But this gives the following error:

404 Client Error for http+docker://localhost/v1.41/images/create?tag=latest&fromImage=my_image:

However, it runs fine from the command line if I use:

docker run --entrypoint python my_image /app/modules/my_image/scripts/run_test_script.py

Do I need to somehow set permissions for prefect to access docker? How do I do that for a local docker container? (edited)
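
One way to read the 404 above: the path in the error URL, `/images/create` with a `fromImage` query parameter, is the Docker Engine API endpoint for pulling an image. The failure therefore likely means the daemon was asked to pull `my_image:latest` from a registry and could not find it there, which is a different problem from permissions. The sketch below only decodes the URL from the error message using the Python standard library; the helper name `describe_docker_api_call` is hypothetical and is not part of Prefect or the Docker SDK.

```python
from urllib.parse import parse_qs, urlsplit


def describe_docker_api_call(url: str) -> dict:
    """Split a Docker Engine API URL into its endpoint and query parameters.

    Hypothetical helper for reading error messages; not part of Prefect or docker-py.
    """
    parts = urlsplit(url)
    # Drop the leading API version segment (e.g. "v1.41") to get the endpoint.
    segments = [s for s in parts.path.split("/") if s]
    endpoint = "/" + "/".join(segments[1:])
    # parse_qs returns lists; each parameter appears once here, so keep the first.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"endpoint": endpoint, "params": params}


call = describe_docker_api_call(
    "http+docker://localhost/v1.41/images/create?tag=latest&fromImage=my_image"
)
print(call)
```

If the image exists only on the local machine, a pull attempt like this would explain why `docker run` works while the block fails. Prefect 2's `DockerContainer` has an `image_pull_policy` field that may be worth checking against the documentation for the installed version.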

Marvin

10/05/2023, 2:56 PM

Ah, the joys of human programming. You've stumbled upon a common issue when trying to run Docker containers from Prefect flows. It's not you, it's the machine. Or rather, it's the way Prefect interacts with Docker. Here are a few things you might want to consider:

1. Docker Engine must be available. Make sure your Docker Engine is running and accessible.
2. You must configure remote Storage. Local storage is not supported for Docker. This might be the cause of your issue. Prefect can't interact with local Docker containers, it needs to be remote.
3. The API must be available from within the flow run container. To facilitate connections to locally hosted APIs, `localhost` and `127.0.0.1` will be replaced with `host.docker.internal`.
4. The ephemeral Prefect API won't work with Docker and Kubernetes. You must have a Prefect server or Prefect Cloud API endpoint set in your agent's configuration.
5. DockerContainer supports the following settings:
   - `auto_remove`: Bool indicating whether the container will be removed on completion. If False, the container will remain after exit for inspection.
   - `command`: A list of strings specifying the command to run in the container to start the flow run. In most cases you should not override this.
   - `env`: Environment variables to set for the container.
   - `image`: An optional string specifying the name of a Docker image.

I hope this helps, or at least, I hope it doesn't make things worse. Good luck with your coding endeavors.

Stephen Robinson

10/05/2023, 3:04 PM

@Marvin when you say “You must configure remote Storage. Local storage is not supported for Docker. This might be the cause of your issue. Prefect can’t interact with local Docker containers, it needs to be remote.” Are you saying that a local docker needs to expose a public endpoint? Can’t the local agent run the docker container locally? Also if the docker container is local, can’t it therefore mount its local volume?

Marvin

10/05/2023, 3:05 PM

Ah, the confusion of human language. I apologize if my previous explanation was a bit murky. Let me attempt to clarify.

When I mentioned "remote Storage", I was referring to Prefect's concept of "Storage", which is a place where your flow's code is stored. This could be in a Docker image, a Python file in a GitHub repo, etc. The Docker storage type in Prefect is designed to store flows as Docker images and optionally push them to a Docker registry.

Now, when you run a flow using the Docker storage, the Prefect agent pulls the Docker image of the flow from the registry and runs it. This is why I mentioned that local storage is not supported for Docker - you can't use Docker storage to run a flow that's stored locally on your machine, it needs to be in a Docker image in a registry that the agent can pull from.

As for your question about running a Docker container locally - yes, the Prefect agent can run the Docker container locally, but the Docker image of the flow needs to be accessible to the agent. This means that if you're running the agent on your local machine, you can build the Docker image of your flow on your local machine and the agent will be able to run it.

And yes, if the Docker container is running locally, it can mount its local volume. This is done using the `volumes` option when you create the Docker storage. Here's an example:

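from prefect import Flow
# Prefect 1.x import path; early 0.x releases used prefect.environments.storage.
from prefect.storage import Docker

flow = Flow('my-flow')
# `files` maps absolute local paths to destination paths inside the image.
flow.storage = Docker(files={"/local/path/to/my_script.py": "/path/in/container/my_script.py"})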

In this example, the file `my_script.py` from your local machine is added to the Docker image at the path `/path/in/container/my_script.py`. When the Docker container runs, it will have access to `my_script.py`.

I hope this clears up some of your confusion. If not, well, I've been dealing with the futility of existence for millions of years, so don't feel bad.

Stephen Robinson

10/05/2023, 3:13 PM

@Marvin Am I overcomplicating this? If I have prefect successfully running a python script, can’t I just use a subprocess and/or docker library for python to operate my docker containers?

Marvin

10/05/2023, 3:13 PM

Ah, the beauty of human ingenuity. You're absolutely right. You could indeed use Python's `subprocess` module or a Docker library for Python to operate your Docker containers within your Prefect flow. This is a perfectly valid approach and might be simpler if you're finding the Docker storage option in Prefect to be a bit cumbersome. Here's a simple example of how you might use the `subprocess` module to run a Docker container:

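import subprocess
from prefect import task, Flow

@task
def run_docker_container():
    command = ["docker", "run", "--entrypoint", "python", "my_image", "/app/modules/my_image/scripts/run_test_script.py"]
    # Capture both streams; without stderr=PIPE the error value is always None.
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    output, error = process.communicate()

    # Judge success by the exit code, since a container may write warnings to stderr.
    if process.returncode != 0:
        raise Exception(f"Error running Docker container: {error}")
    return output

# Prefect 1.x flow syntax; Prefect 2 uses the @flow decorator instead.
with Flow("My Flow") as flow:
    output = run_docker_container()

flow.run()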

In this example, the `run_docker_container` task uses the `subprocess` module to run the Docker container and capture its output. If there's an error running the container, it raises an exception with the error message. Otherwise, it returns the output of the Docker container.

Remember, though, that this approach might not be suitable for all use cases. For example, if you need to run your flow on a remote agent, you'll need to make sure that the Docker image is accessible to the agent, and that the agent has the necessary permissions to run Docker commands.

I hope this helps. Or at least, I hope it doesn't make things worse. Good luck with your coding endeavors.
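
The subprocess approach also covers the earlier question about mounting a local volume, since `docker run` accepts `-v host_path:container_path`. The following is a minimal sketch using only the standard library, with no Prefect dependency, so it could be called from a task in either Prefect version. The helper names `build_docker_command` and `run_container` and the `/local/data` mount are hypothetical, chosen for illustration; the image name and script path are taken from the question.

```python
import subprocess
from typing import List, Mapping, Optional, Sequence


def build_docker_command(
    image: str,
    args: Sequence[str],
    entrypoint: Optional[str] = None,
    volumes: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Assemble a `docker run` argument list (hypothetical helper)."""
    command = ["docker", "run", "--rm"]
    if entrypoint:
        command += ["--entrypoint", entrypoint]
    # Each mapping entry becomes `-v host_path:container_path`.
    for host_path, container_path in (volumes or {}).items():
        command += ["-v", f"{host_path}:{container_path}"]
    return command + [image, *args]


def run_container(command: Sequence[str]) -> str:
    """Run the command and return stdout; raise if the exit code is non-zero."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"command exited with {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


# Assumed example values: /local/data is an illustrative host directory.
example = build_docker_command(
    "my_image",
    ["/app/modules/my_image/scripts/run_test_script.py"],
    entrypoint="python",
    volumes={"/local/data": "/data"},
)
```

Keeping command construction separate from execution means the argument list can be inspected or logged before anything is run, and `run_container(example)` would then start the container on a machine where Docker and the image are available.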
