Riley Hun 12/05/2020, 9:35 PM
from outside the cluster, within my own local machine's CLI, the error goes away and the flow runs successfully. Some insight on this would be much appreciated. I can confirm that the Prefect version of the deployed agent is correct too. Here's the error I'm getting from using the built-in agent that comes with the Helm chart deployment:
prefect agent kubernetes start
[2020-12-05 10:30:03+0000] ERROR - prefect.CloudTaskRunner | Failed to set task state with error: ConnectionError(MaxRetryError("HTTPConnectionPool(host='prefect-server-gke-apollo.default', port=4200): Max retries exceeded with url: /graphql//graphql (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5a560d3690>: Failed to establish a new connection: [Errno -2] Name or service not known'))"))
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/urllib3/connection.py", line 157, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/opt/conda/lib/python3.7/site-packages/urllib3/util/connection.py", line 61, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/opt/conda/lib/python3.7/socket.py", line 752, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
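The traceback bottoms out in `socket.getaddrinfo`, so the failure is a name lookup rather than a refused connection. A minimal sketch for reproducing just that lookup from wherever the flow runs is below. The helper name `can_resolve` is hypothetical, and the host and port are simply copied from the log above, so adjust them to your own setup:

```python
import socket


def can_resolve(host: str, port: int) -> bool:
    """Return True if `host` resolves to at least one address, else False.

    This performs the same lookup the traceback fails on
    (socket.getaddrinfo), without opening a connection.
    """
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        # Covers "[Errno -2] Name or service not known" and similar lookup failures.
        return False


if __name__ == "__main__":
    # Host and port taken from the error message; assumed, not verified.
    host, port = "prefect-server-gke-apollo.default", 4200
    status = "resolves" if can_resolve(host, port) else "does NOT resolve"
    print(f"{host}:{port} {status}")
```

Running this once from the local machine and once from inside the pod that executes the flow might help narrow down where the name stops resolving.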
in the agent deployment YAML file. So something like this:
command:
  - bash
  - "-c"
  - "prefect agent kubernetes start --api <apollo_endpoint>"
as it’s trying to use the in-cluster IP resolution, e.g.
environment variable in the agent deployment
Riley Hun 12/07/2020, 8:57 PM