How do we add `host.docker.internal` to `/etc/...
# prefect-community
r
How do we add `host.docker.internal` to `/etc/hosts` via `--add-host`? Is this something we add to the running agent or to the config.toml?
j
Sorry, why are you trying to do this? What needs to access `host.docker.internal`?
r
@Jim Crist-Harif I am getting this error referenced in this thread [1]. I have been trying all day to resolve it but couldn't find a resolution. [1] https://github.com/PrefectHQ/prefect/issues/2324
j
The architecture of prefect server has changed significantly since then, I suspect you're running into a different (but similar looking) issue. Can you post the tracebacks you're seeing and things you've tried?
r
Here are my diagnostics:
{
  "config_overrides": {
    "server": {
      "ui": {
        "apollo_url": true
      }
    }
  },
  "env_vars": [],
  "system_information": {
    "platform": "Linux-5.4.0-1029-gcp-x86_64-with-glibc2.29",
    "prefect_backend": "server",
    "prefect_version": "0.13.16",
    "python_version": "3.8.5"
  }
}
Full Error Log:
[2020-11-18 22:28:12+0000] ERROR - prefect.CloudTaskRunner | Failed to set task state with error: ConnectionError(MaxRetryError("HTTPConnectionPool(host='host.docker.internal', port=4200): Max retries exceeded with url: /graphql (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4a34b2eb50>: Failed to establish a new connection: [Errno -2] Name or service not known'))"))
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/urllib3/connection.py", line 160, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/opt/conda/lib/python3.7/site-packages/urllib3/util/connection.py", line 61, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/opt/conda/lib/python3.7/socket.py", line 752, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py", line 392, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/opt/conda/lib/python3.7/http/client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/opt/conda/lib/python3.7/http/client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/opt/conda/lib/python3.7/http/client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/opt/conda/lib/python3.7/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/opt/conda/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/opt/conda/lib/python3.7/site-packages/urllib3/connection.py", line 187, in connect
    conn = self._new_conn()
  File "/opt/conda/lib/python3.7/site-packages/urllib3/connection.py", line 172, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f4a34b2eb50>: Failed to establish a new connection: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py", line 727, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/opt/conda/lib/python3.7/site-packages/urllib3/util/retry.py", line 439, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='host.docker.internal', port=4200): Max retries exceeded with url: /graphql (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4a34b2eb50>: Failed to establish a new connection: [Errno -2] Name or service not known'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/prefect/engine/cloud/task_runner.py", line 128, in call_runner_target_handlers
    cache_for=self.task.cache_for,
  File "/opt/conda/lib/python3.7/site-packages/prefect/client/client.py", line 1461, in set_task_run_state
    version=version,
  File "/opt/conda/lib/python3.7/site-packages/prefect/client/client.py", line 302, in graphql
    retry_on_api_error=retry_on_api_error,
  File "/opt/conda/lib/python3.7/site-packages/prefect/client/client.py", line 218, in post
    retry_on_api_error=retry_on_api_error,
  File "/opt/conda/lib/python3.7/site-packages/prefect/client/client.py", line 434, in _request
    session=session, method=method, url=url, params=params, headers=headers
  File "/opt/conda/lib/python3.7/site-packages/prefect/client/client.py", line 340, in _send_request
    response = session.post(url, headers=headers, json=params, timeout=30)
  File "/opt/conda/lib/python3.7/site-packages/requests/sessions.py", line 578, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/conda/lib/python3.7/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
I've tried what @Laura Lorenz suggested in this thread [1]
# new 20.04.1 LTS
sudo apt update
sudo apt install python3-pip
pip3 install prefect
sudo apt install docker docker-compose
sudo systemctl start  docker
sudo usermod -aG docker $USER
# logged out to refresh my user groups
docker container run hello-world
# add firewall rule in GCP to allow ingress on port 8080
# changed config.toml to reference apollo url
prefect backend server
prefect server start
[1] https://github.com/PrefectHQ/server/issues/25
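The `[Errno -2] Name or service not known` at the bottom of the traceback is a name-resolution failure: `host.docker.internal` does not resolve inside the flow container. A minimal sketch for checking that from inside the container, using only the standard library. The `can_resolve` helper and its `resolver` parameter are illustrative names, not part of Prefect.

```python
import socket


def can_resolve(host, port=4200, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address.

    `resolver` is injectable only so the check can be exercised without
    real DNS; by default it is socket.getaddrinfo, the same call that
    raises socket.gaierror in the traceback.
    """
    try:
        return len(resolver(host, port)) > 0
    except socket.gaierror:
        # e.g. [Errno -2] Name or service not known
        return False


# Inside the flow container one could run:
#   can_resolve("host.docker.internal")
```

If this returns `False` in the container but the agent itself can reach the API, the problem is the hostname handed to the flow, not the server.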
j
How did you start your agent? It's not clear to me where the flow is getting that address from, but it has to be getting it from somewhere.
r
prefect agent start docker --show-flow-logs
As a side note, I tried this on Prefect Core server on my local machine as well, and still ran into the same issue.
j
Hmmm, ok. I'm taking off for today, but will flag this thread for whoever is on support tomorrow to pick up.
r
Thanks! Appreciate it!
Is there anyone available today to take a look at this issue? I would very much appreciate it!
d
Hey @Riley Hun! Just to start with the basics, your docker agent is successfully connecting to your Prefect Server instance, right?
In that it polls for work and starts containers successfully?
r
Hi @Dylan, Yup sure does. It is able to retrieve the runs successfully and even successfully pulls in the docker image from GCS.
d
Can you try explicitly providing a URI to the agent start command:
prefect agent docker start --api http://localhost:4200
r
Is this the uri of the prefect server? Mine is on port 8080.
d
That’s the one
I think that setting may tell the agent how to configure the api path in the flow
But if that doesn’t work, please let me know
r
Okay got it, thanks. I'm just generating the docker image and pushing to GCR right now - might take a bit...
d
👍 no rush on my end 😛
Btw the default port is actually `localhost:4200`
r
Oh and that's the graphql server?
d
Technically it’s apollo which then fetches the schema from graphql (these are named services in prefect server)
It’s the externally-facing API which is a graphql endpoint
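Inside a flow-run container, `localhost` refers to the container itself, so an API URL of `http://localhost:4200` has to be translated to a name that points back at the host. That would explain why the error mentions `host.docker.internal:4200`. The sketch below illustrates that kind of translation with the standard library. It assumes this is roughly what happens to the URL and is not Prefect's actual code; `container_api_url` is a hypothetical helper.

```python
from urllib.parse import urlsplit, urlunsplit


def container_api_url(api_url, host_alias="host.docker.internal"):
    """Rewrite a localhost API URL so it points at the Docker host."""
    parts = urlsplit(api_url)
    if parts.hostname not in ("localhost", "127.0.0.1"):
        return api_url  # already a routable name; leave untouched
    # Keep the original port (if any) and swap only the hostname.
    netloc = host_alias if parts.port is None else f"{host_alias}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

The alias only helps if the container can resolve it. On a plain Linux Docker Engine that needs an `/etc/hosts` entry, which is what `--add-host` provides.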
r
Hmm... no I'm afraid that didn't do the trick. I'm getting the same error.
d
hmmmm
r
prefect agent docker start --api http://localhost:4200

[2020-11-19 22:40:55,824] INFO - agent | Starting DockerAgent with labels []
[2020-11-19 22:40:55,824] INFO - agent | Agent documentation can be found at https://docs.prefect.io/orchestration/
[2020-11-19 22:40:55,824] INFO - agent | Agent connecting to the Prefect API at http://localhost:4200
[2020-11-19 22:40:55,852] INFO - agent | Waiting for flow runs...
[2020-11-19 22:43:13,018] INFO - agent | Found 1 flow run(s) to submit for execution.
[2020-11-19 22:43:13,119] INFO - agent | Deploying flow run ce49e7a2-e8f6-4103-88c4-cfe371225d03
[2020-11-19 22:43:13,120] INFO - agent | Pulling image gcr.io/aa-mlops-dev-inm5/prefect-etl-storage:0.1.0...
[2020-11-19 22:43:17,278] INFO - agent | Successfully pulled image gcr.io/aa-mlops-dev-inm5/prefect-etl-storage:0.1.0...
Kubernetes Pods Logs:
[2020-11-19 22:43:21+0000] ERROR - prefect.CloudTaskRunner | Failed to set task state with error: ConnectionError(MaxRetryError("HTTPConnectionPool(host='host.docker.internal', port=4200): Max retries exceeded with url: /graphql (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f223bdd7b70>: Failed to establish a new connection: [Errno -2] Name or service not known'))"))
d
Looking at the traceback, it seems like something on Linux is preventing it from communicating back to `host.docker.internal`
We’ll pick the thread back up on this issue https://github.com/PrefectHQ/server/issues/25
r
Could this be the issue? `get_docker_ip()` isn't returning anything?
from prefect.utilities.docker_util import get_docker_ip
print(get_docker_ip())
d
🧐
r
Ohhhh. Okay I'm on my Mac on my local machine. When I use my remote machine using Ubuntu, then it returns something.
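The difference between macOS and Ubuntu is consistent with a lookup that depends on the Linux `docker0` bridge, which does not exist as a host interface under Docker Desktop on macOS. As a hedged illustration, the sketch below extracts the bridge address from `ip route` output and formats it as the `--add-host` value asked about at the start of the thread. This assumes the IP is read from the `docker0` route line, which may not match how `get_docker_ip` is actually implemented. `bridge_ip_from_routes`, `add_host_flag`, and the sample route text are made up for illustration.

```python
def bridge_ip_from_routes(route_text, interface="docker0"):
    """Return the `src` address on the route line for `interface`, or None."""
    for line in route_text.splitlines():
        fields = line.split()
        if interface in fields and "src" in fields:
            idx = fields.index("src")
            if idx + 1 < len(fields):
                return fields[idx + 1]
    return None


def add_host_flag(ip, name="host.docker.internal"):
    """Format a `docker run --add-host` argument mapping `name` to `ip`."""
    return f"--add-host={name}:{ip}"


# Sample `ip route` output as it might look on a Linux Docker host (made up).
SAMPLE = """default via 10.128.0.1 dev ens4 proto dhcp metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
"""

ip = bridge_ip_from_routes(SAMPLE)
print(add_host_flag(ip))  # --add-host=host.docker.internal:172.17.0.1
```

On a machine with no `docker0` route, `bridge_ip_from_routes` returns `None`, which mirrors the empty result seen on the Mac.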
d
right
we’re going to pick up the thread with you on this github issue
Once we have some bandwidth to spin up an instance and get into the weeds
Keep an eye on this issue and let us know if you make any progress
r
Okay sounds good, thanks. I guess development will be stalled a bit. It was working fine in August though. Then I returned to this project in November and tried to run using the same build script and it failed. Not sure if that's an important detail, but thought I'd point it out.
d
Interesting. Would you happen to know your version(s) in August? Add any and all details you can think of to the issue, please 😄 Any information helps!
r
Let me check my github repo history.
Before, I was using 0.12.6. Also should note that before, I was using the same docker image for the dask workers and flow storage, which inherited from prefecthq/prefect:0.12.6-python3.7. Now, I'm using the daskdev/dask docker image for the dask workers and prefecthq/prefect:0.13.15-python3.7 for flow storage.
@Dylan - Just thought I would let you know that I switched to using the newly released helm chart to deploy Prefect Server on Kubernetes and then used a Kubernetes Agent and now my flow is working just fine. Still not able to get it working using a Compute Engine instance w/ Docker Agent though, but perfectly ok with Prefect Server on GKE instead.
d
Glad you were able to figure something out!