
Ritabrata Moitra

05/11/2023, 6:14 AM
Hey folks. Two questions on running flows on Docker:
1. The deployment guide mentions "You'll need Docker Engine installed and running on the same machine as your agent". My agent runs on EKS using the official helm chart, but fails with `RuntimeError: Could not connect to Docker`. Do I need to install Docker Engine separately on the agent?
2. Can someone point me towards resources on running flows embedded in ECR images through ECS? (My agent would continue to run on EKS though.)
Any help would be greatly appreciated 🙏

Nate

05/11/2023, 2:41 PM
Hi @Ritabrata Moitra
1. If you're running the helm chart on your EKS cluster, you should not need to install the Docker Engine separately. Can you see that your agent is healthy in the Prefect UI after installing the helm chart for the agent?
2. I know there are some more resources on ECS in the works, but here's info on using the `ECSTask` infra block. If you're using the infra block, you'd just have to set the appropriate `image` URI on your infra block and then use that in your deployment.
Worth noting that ECS workers are a thing now. Workers are (newer) strongly typed agents that we will recommend going forward, as they are designed to work nicely with work pools (which are also typed). You can then use projects to define a deployment that overrides the `image` in the ECS job configuration via `ECSVariables`.
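Editor's note: the override idea above can be pictured as pool-level defaults with deployment-level values layered on top. The sketch below is a simplified model in plain Python, not Prefect's API. The function name `resolve_job_variables`, the default values, and the ECR image URI are all hypothetical placeholders.

```python
from copy import deepcopy

# Hypothetical defaults that a work pool might define for every ECS job.
POOL_DEFAULTS = {
    "image": "prefecthq/prefect:2-latest",
    "cpu": 512,
    "memory": 1024,
}


def resolve_job_variables(pool_defaults: dict, deployment_overrides: dict) -> dict:
    """Return a copy of the pool defaults with deployment overrides applied on top."""
    resolved = deepcopy(pool_defaults)
    resolved.update(deployment_overrides)
    return resolved


# Hypothetical ECR image URI; substitute your own account, region, and repository.
overrides = {"image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-flows:latest"}
job = resolve_job_variables(POOL_DEFAULTS, overrides)
print(job["image"])
```

Only the `image` key changes here; the other defaults carry through untouched, so each deployment can point at its own image while sharing the rest of the pool's configuration.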

Ritabrata Moitra

05/14/2023, 4:39 PM
Thanks a ton for the prompt response @Nate. Will give ECS workers a shot! 😄
All of my flow runs continue to fail with the same `RuntimeError: Could not connect to Docker.` error. The helm agents have a similar output:
04:34:46.199 | INFO    | prefect.agent - Submitting flow run 'ec2bec70-7375-4664-a3e7-9e1a98f6ee94'
04:34:46.400 | ERROR   | prefect.agent - Failed to submit flow run 'ec2bec70-7375-4664-a3e7-9e1a98f6ee94' to infrastructure.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 398, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.10/http/client.py", line 1283, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.10/http/client.py", line 1329, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.10/http/client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.10/http/client.py", line 1038, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.10/http/client.py", line 976, in send
    self.connect()
  File "/usr/local/lib/python3.10/site-packages/docker/transport/unixconn.py", line 30, in connect
    sock.connect(self.unix_socket)
FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.10/site-packages/urllib3/util/retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.10/site-packages/urllib3/packages/six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 398, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.10/http/client.py", line 1283, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.10/http/client.py", line 1329, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.10/http/client.py", line 1278, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.10/http/client.py", line 1038, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.10/http/client.py", line 976, in send
    self.connect()
  File "/usr/local/lib/python3.10/site-packages/docker/transport/unixconn.py", line 30, in connect
    sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/docker/api/client.py", line 214, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/usr/local/lib/python3.10/site-packages/docker/api/daemon.py", line 181, in version
    return self._result(self._get(url), json=True)
  File "/usr/local/lib/python3.10/site-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/docker/api/client.py", line 237, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 600, in get
    return self.request("GET", url, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/adapters.py", line 547, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/docker.py", line 650, in _get_client
    docker_client = docker.from_env()
  File "/usr/local/lib/python3.10/site-packages/docker/client.py", line 96, in from_env
    return cls(
  File "/usr/local/lib/python3.10/site-packages/docker/client.py", line 45, in __init__
    self.api = APIClient(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/docker/api/client.py", line 197, in __init__
    self._version = self._retrieve_server_version()
  File "/usr/local/lib/python3.10/site-packages/docker/api/client.py", line 221, in _retrieve_server_version
    raise DockerException(
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/agent.py", line 490, in _submit_run_and_capture_errors
    result = await infrastructure.run(task_status=task_status)
  File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/docker.py", line 322, in run
    container = await run_sync_in_worker_thread(self._create_and_start_container)
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/docker.py", line 425, in _create_and_start_container
    docker_client = self._get_client()
  File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/docker.py", line 653, in _get_client
    raise RuntimeError("Could not connect to Docker.") from exc
RuntimeError: Could not connect to Docker.
04:34:46.411 | INFO    | prefect.agent - Completed submission of flow run 'ec2bec70-7375-4664-a3e7-9e1a98f6ee94'
My docker-container infrastructure block exactly resembles the configuration provided here - https://docs.prefect.io/latest/guides/deployment/docker/ Could you folks give any hints on what I might be doing wrong? Would be grateful for any help 🙏
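Editor's note: the traceback above ends in `sock.connect(self.unix_socket)` raising `FileNotFoundError`, which suggests that no Docker Unix socket exists where the agent is running. A minimal sketch for checking that precondition from inside the agent's environment follows. It uses only the standard library and assumes the conventional Linux socket path `/var/run/docker.sock` and the `DOCKER_HOST` environment variable. The helper name `docker_socket_status` is hypothetical.

```python
import os
from pathlib import Path

DEFAULT_SOCKET = "/var/run/docker.sock"  # conventional Docker socket path on Linux


def docker_socket_status(environ=os.environ, default_socket=DEFAULT_SOCKET) -> str:
    """Describe whether a local Docker daemon socket appears to be available."""
    host = environ.get("DOCKER_HOST", "")
    if host and not host.startswith("unix://"):
        # A tcp:// or ssh:// host means the daemon is remote; no local socket is expected.
        return f"DOCKER_HOST points at a remote daemon: {host}"
    socket_path = Path(host[len("unix://"):] if host else default_socket)
    if socket_path.exists():
        return f"found Docker socket at {socket_path}"
    return f"no Docker socket at {socket_path}"


print(docker_socket_status())
```

If this reports that no socket exists, the agent has no Docker daemon to talk to, and a docker-container infrastructure block would be expected to fail in the way shown in the log.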

Christopher Boyd

05/15/2023, 7:57 PM
is this running docker locally on your machine, or on another system?

Nate

05/15/2023, 8:06 PM
ahh I think the issue is that you're using a docker-container infrastructure for your deployment

Christopher Boyd

05/15/2023, 8:07 PM
Is the intent to run in Docker specifically, or is it to run in EKS? These are generally two different things - if you’re running in EKS, you would want to submit your jobs locally to EKS
if you would like to run a docker agent, then that would typically be on a VM or locally, with docker installed and running
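Editor's note: to illustrate the distinction drawn here, a flow packaged in a Docker image can run on EKS as a Kubernetes Job that references the image, with no Docker daemon needed next to the agent. The sketch below builds a minimal Job manifest as a plain dictionary. The function name `build_job_manifest`, the job name, and the ECR image URI are hypothetical, and the container command is an assumption about how the flow run is started.

```python
import json


def build_job_manifest(name: str, image: str, command: list[str]) -> dict:
    """Build a minimal Kubernetes Job manifest that runs one container to completion."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 0,  # do not retry the pod on failure
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "flow", "image": image, "command": command}
                    ],
                }
            },
        },
    }


manifest = build_job_manifest(
    name="example-flow-run",
    # Hypothetical ECR image URI; substitute your own.
    image="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-flows:latest",
    # Assumed entrypoint for a Prefect 2 flow run; adjust to your setup.
    command=["python", "-m", "prefect.engine"],
)
print(json.dumps(manifest, indent=2))
```

The cluster pulls and runs the image itself, so the same image can in principle be reused across EKS, ECS, or a Docker host by changing only the infrastructure configuration.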

Jafar A

05/15/2023, 8:58 PM
I had the same issue today as well. I reinstalled Docker Desktop, got the Apple chip version, and it's working great now.
I guess not! Funny, it ran a few jobs and then started giving the same error again!!

Ritabrata Moitra

05/16/2023, 11:13 AM
Ahhh! I think I might have misunderstood the `docker-container` infrastructure terminology. What I basically want is to build a Docker image from my code, and then execute it anywhere (EKS / ECS …). My understanding was that this is what the `docker-container` infrastructure block facilitates, but it looks like I was mistaken. I am generally a bit averse to running flows directly on EKS (or ECS for that matter) because of the dependency issues. In our setup, we containerise all our workloads and then run them on different infrastructure components (EC2 / Fargate / Lambda etc.), and I was trying to replicate the same. How do you folks typically go about this?

Christopher Boyd

05/16/2023, 12:18 PM
I’m not sure what you mean by dependency issues. If the code is in a Docker image, then there shouldn’t be a dependency issue, as it’s baked into the container?
AWS (and other clouds) provide plenty of tools, so you can certainly do the same thing in different ways to similar effect, but without knowing your particular case, EKS running flows is a very very common implementation