Hey, I’m trying to start a docker agent on CentOS7...
# prefect-server
g
Hey, I’m trying to start a docker agent on CentOS7 (via prefect agent docker start), and get this error: "urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='host.docker.internal', port=4200): Max retries exceeded with url: /graphql (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2881d77350>: Failed to establish a new connection: [Errno 113] No route to host'))" I thought maybe it was this bug (https://github.com/PrefectHQ/server/issues/25) but I still can’t figure out how to work around it. Any suggestions on how to move forward? I’m stuck.
n
Hi @Garret Cook - is this against Cloud or a local Server?
g
local server
n
Local to the agent as well, is that correct?
g
Correct
n
Great - can you confirm that you've run
prefect backend server
before attempting to start your agent?
g
Confirmed
I can get to the UI on 8080 also
n
Ok fantastic. From the same terminal can you run
curl http://localhost:4200/graphql
and see what's returned?
g
[root@prefect ~]# curl -v http://localhost:4200/graphql
* About to connect() to localhost port 4200 (#0)
*   Trying ::1...
* Connected to localhost (::1) port 4200 (#0)
> GET /graphql HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:4200
> Accept: */*
>
< HTTP/1.1 400 Bad Request
< X-Powered-By: Express
< Access-Control-Allow-Origin: *
< Content-Type: text/html; charset=utf-8
< Content-Length: 18
< ETag: W/"12-7JEJwpG8g89ii7CR/6hhfN27Q+k"
< Date: Fri, 22 Jan 2021 17:03:45 GMT
< Connection: keep-alive
< Keep-Alive: timeout=5
<
* Connection #0 to host localhost left intact
GET query missing.
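The 400 response with "GET query missing." suggests the GraphQL endpoint is reachable and is only complaining that no query was sent. As a rough sketch, the same check can be done from Python with a real query, using only the standard library. The helper names are hypothetical, and the `{ __typename }` query is a generic GraphQL field picked for illustration, not something taken from this thread.

```python
import json
import urllib.request


def build_graphql_request(url, query="{ __typename }"):
    """Build a POST request carrying a GraphQL query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def graphql_is_up(url, timeout=5.0):
    """Return True if the endpoint answers the query with HTTP 200."""
    try:
        request = build_graphql_request(url)
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, so this covers
        # unreachable hosts, refused connections and HTTP error statuses.
        return False


if __name__ == "__main__":
    # URL taken from the curl command above.
    print(graphql_is_up("http://localhost:4200/graphql"))
```

This only shows that the API answers from wherever the script runs; it says nothing about whether a container can reach it.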
n
Ok cool. And the docker host is accessible as well, correct?
g
[root@prefect ~]# docker run hello-world
Unable to find image 'hello-world:latest' locally
Trying to pull repository docker.io/library/hello-world ...
latest: Pulling from docker.io/library/hello-world
0e03bdcc26d7: Pull complete
Digest: sha256:31b9c7d48790f0d8c50ab433d9c3b7e17666d6993084c002c2ff1ca09b96391d
Status: Downloaded newer image for docker.io/hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/
Yes, it responds readily
n
Ok great, that narrows down the issue significantly. Can you run this for me:
echo "from prefect.utilities.docker_util import get_docker_ip\nprint(get_docker_ip())" > docker_ip.py && python docker_ip.py
Oops, hold off on that
Actually that should be fine
g
[root@prefect ~]# cat docker_ip.py
from prefect.utilities.docker_util import get_docker_ip
print(get_docker_ip())
[root@prefect ~]# python3.6 docker_ip.py
172.17.0.1
[root@prefect ~]#
I had to edit the command a little, but I think I got what you wanted
n
Ah yeah you got it (I'm not familiar with CentOS)
Ok I suspect this could be an issue with firewalld - you're able to communicate with the API over http from a variety of sources but not through the docker host. You'll need to allow container to container communication to enable this, I think, which you can do with
firewall-cmd --zone=public --add-masquerade --permanent && firewall-cmd --reload
There's a possibility you'll need to restart the docker service as well
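To separate "the API is down" from "the firewall is blocking the path", a plain TCP check can help. This is a minimal sketch with a hypothetical helper name; the host names and port are the ones that appear in this thread (172.17.0.1 from get_docker_ip(), host.docker.internal and port 4200 from the error). It is probably most informative when run from inside a container, since a connection made from the host to its own bridge address is likely treated as local traffic and may not exercise the same firewall rules.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "No route to host" (errno 113), refused connections,
        # timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Hosts and port come from this thread; adjust for your own setup.
    for host in ("host.docker.internal", "172.17.0.1"):
        print(host, can_connect(host, 4200))
```

If this prints False inside a container while curl against localhost works on the host, that would be consistent with a firewall rule sitting between the Docker bridge and the host.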
g
Ok, shall I retry the docker agent now?
ok, I’ll reboot the whole system
n
Any luck?
g
Progress I think
[root@prefect ~]# prefect agent docker start
 ____            __           _        _                    _
|  _ \ _ __ ___ / _| ___  ___| |_     / \   __ _  ___ _ __ | |_
| |_) | '__/ _ \ |_ / _ \/ __| __|   / _ \ / _` |/ _ \ '_ \| __|
|  __/| | |  __/  _|  __/ (__| |_   / ___ \ (_| |  __/ | | | |_
|_|   |_|  \___|_|  \___|\___|\__| /_/   \_\__, |\___|_| |_|\__|
                                           |___/
[2021-01-22 17:27:21,044] INFO - agent | Starting DockerAgent with labels []
[2021-01-22 17:27:21,045] INFO - agent | Agent documentation can be found at https://docs.prefect.io/orchestration/
[2021-01-22 17:27:21,045] INFO - agent | Agent connecting to the Prefect API at http://localhost:4200
[2021-01-22 17:27:21,079] INFO - agent | Waiting for flow runs...
[2021-01-22 17:27:21,316] INFO - agent | Found 1 flow run(s) to submit for execution.
[2021-01-22 17:27:21,513] INFO - agent | Deploying flow run 2d740d95-feb3-4ae9-b002-c4581595abf8
[2021-01-22 17:27:21,518] INFO - agent | Pulling image prefecthq/prefect:0.14.3...
[2021-01-22 17:27:24,245] INFO - agent | Successfully pulled image prefecthq/prefect:0.14.3...
[2021-01-22 17:28:08,008] INFO - agent | Found 1 flow run(s) to submit for execution.
[2021-01-22 17:28:08,175] INFO - agent | Deploying flow run 980ddf94-76c8-4e90-8219-59c80a3b30e5
[2021-01-22 17:28:08,181] INFO - agent | Pulling image prefecthq/prefect:0.14.3...
[2021-01-22 17:28:08,933] INFO - agent | Successfully pulled image prefecthq/prefect:0.14.3...
n
That looks good!
g
It deploys the flow without error, but they don’t seem to finish
n
Hm, that seems like a different issue then, could be code-related
Or perhaps infra-related
g
[root@prefect production-flows]# cat task.py
import prefect
from prefect.run_configs.docker import DockerRun
from prefect import task, Flow

@task
def say_hello():
    logger = prefect.context.get("logger")
    logger.info("Hello, Cloud!")

with Flow("hello-flow") as flow:
    say_hello()

flow.run_config = DockerRun(
    #env={"SOME_VAR": "VALUE"},
    #image="example/image-name:with-tag"
)

# builds the Docker image and pushes Flow metadata to the Prefect Cloud database
#flow.deploy(project_name="Backups")
flow.register(project_name="Backups")
n
Hm, any messages in the UI? Are the runs entering a Running state?
g
Just sitting in ‘submitted’
n
Oh you know what, it doesn't look like you're specifying an image to use with DockerRun - you'll either need to specify an image for that OR add docker storage to the flow:
from prefect.storage import Docker

flow.storage = Docker()
The agent is probably submitting the flow to the docker daemon but the daemon doesn't know what to do with it since there's no image
g
Interesting
n
(that's my suspicion, anyway)
g
Def a part of the API I didn’t understand well
Why does it download the prefect docker image then? I thought it was going to try to run my commands inside that?
[2021-01-22 17:43:53,340] INFO - agent | Pulling image prefecthq/prefect:0.14.3...
^^What is it doing that for?
n
Good question, I'm not entirely sure why that's happening.
Let me know when you've added docker storage and how it works out. I'll ping the core team if that doesn't fix the issue
g
Ok, so I apologize, the original issue remains. It only shows up when debugging is turned on via:
prefect agent docker start --show-flow-logs
I had stopped using show-flow-logs which meant it didn’t show up
I added docker storage, with no change
n
Sorry when you say the original issue, do you mean that you can't start your agent any longer?
Or that the flows are still stuck in submitted?
g
The agent runs fine, but every flow stays in ‘submitted’ and the agent shows this error for every docker flow:
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='host.docker.internal', port=4200): Max retries exceeded with url: /graphql (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f0166b2a908>: Failed to establish a new connection: [Errno 113] No route to host',))
I tried adding the interface bridge to the firewalld trusted zone
and turned on masquerade
no effect
Ok, cooking with gas. Had to add docker0 to the trusted zone
br-8a166a45514d bridge also
now it can communicate
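The thread does not record the exact commands used for this fix, so the following is only a sketch of what adding those interfaces to the trusted zone might look like. It builds the firewall-cmd invocations and, by default, only prints them. The helper names are hypothetical, and the br-8a166a45514d bridge name is specific to that machine.

```python
import shlex
import subprocess


def trusted_zone_commands(interfaces):
    """Build firewall-cmd calls that add each interface to the trusted zone."""
    cmds = [
        ["firewall-cmd", "--permanent", "--zone=trusted", f"--add-interface={name}"]
        for name in interfaces
    ]
    # Permanent rules only take effect after a reload.
    cmds.append(["firewall-cmd", "--reload"])
    return cmds


def apply(cmds, dry_run=True):
    """Print each command; execute it only when dry_run is False (needs root)."""
    for cmd in cmds:
        print(" ".join(shlex.quote(part) for part in cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Interface names as mentioned in this thread; yours will likely differ.
    apply(trusted_zone_commands(["docker0", "br-8a166a45514d"]))
```

Printing first makes it easy to review the commands before running them as root with dry_run=False. Note that the trusted zone accepts all traffic on those interfaces, which is broader than opening only port 4200.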
n
Oh interesting, so it needed even more access than we thought
So now the runs are being submitted correctly?
g
yep, perfectly, fast and snappy
n
Amazing, well done!
g
Well, I appreciate you tipping me off that it might be a firewall issue
That bug made me think it was a docker problem, I hadn’t looked past that
n
Of course, I updated the link above with what we discovered^ feel free to put in your 2-cents as well
(I'm not totally sure they're the same issue but it may help others that are in a similar spot)
g
yeah, that is one of the most popular google results, good idea