# ask-community
r
Hello everyone, I'm currently trying out Prefect and I was curious about the development life cycle in a remote environment. I set up an EC2 instance where I started a Prefect server, and I can connect to the UI without issue. I then deployed a flow and I'm trying to run an agent on the same machine to see whether it picks up runs, but the work queue remains unhealthy. I gather that the agent doesn't know how to communicate with the API, but I don't really know how to solve this issue. I ran the agent with
--api http://<remote-ip>:4200
or
--api http://localhost:4200
or
--api http://127.0.0.1:4200
or
--api http://0.0.0.0:4200
; the result is the same. If I start the agent locally while specifying the remote API URL, the agent does pick up the runs. Does that mean that, for some reason, it is not possible to run an agent on the same machine as the Prefect server? Is there something I'm missing here?
z
Did you start the server with
--host 0.0.0.0
?
I’m a bit confused. The agent works on another machine but not on the one where the server is running?
r
I ran
prefect server start --host 0.0.0.0
then
prefect agent start -q default
on the same machine (in two different terminals).
And yes, the agent works fine on my local machine (with
--api http://<remote-ip>:4200
)
z
If you do
PREFECT_API_URL=http://localhost:4200/api prefect agent start -q default
what happens?
r
The work queue remains unhealthy 😕
Could it be a connectivity issue? Super weird though, since it's on the same machine.
z
Yeah very weird. What does the agent say on startup?
r
[prefect@ip-xxx-xx-xx-xx ec2-user]$ prefect agent start --work-queue default
Starting v2.10.11 agent connected to http://xxx.xxx.xxx.xxx:4200/api...

  ___ ___ ___ ___ ___ ___ _____     _   ___ ___ _  _ _____
 | _ \ _ \ __| __| __/ __|_   _|   /_\ / __| __| \| |_   _|
 |  _/   / _|| _|| _| (__  | |    / _ \ (_ | _|| .` | | |
 |_| |_|_\___|_| |___\___| |_|   /_/ \_\___|___|_|\_| |_|


Agent started! Looking for work from queue(s): default...
z
Well it sure looks like it’s working 😄
Perhaps there’s a bug with queue health?
r
Is there a way to get the agent to be a bit more verbose? Like a log of the polling attempts?
z
Yeah, PREFECT_LOGGING_LEVEL=DEBUG
r
Ok, it doesn't log much.
When I try to open http://<remote-ip>:4200/api in the browser, I get this:
{"detail":"Not Found"}
I don't know if this is normal behavior
z
That’s normal; there’s nothing to view on that page
http://localhost:4200/api/docs would get you a page
r
Ok. The agent does time out after a while it seems.
I guess it's normal too.
Alright, let me know if you need more info to reproduce. I could also raise a GitHub issue if necessary!
z
What do you mean it times out?
I definitely can’t reproduce with what I have here. We run agents on the same host as the server all the time.
r
The agent logs this error repeatedly:
15:29:31.034 | DEBUG   | prefect.client - Encountered retryable exception during request. Another attempt will be made in 27.52727884860604s. This is attempt 5/6.
Traceback (most recent call last):
  File "/home/prefect/.local/lib/python3.9/site-packages/httpcore/backends/asyncio.py", line 114, in connect_tcp
    stream: anyio.abc.ByteStream = await anyio.connect_tcp(
  File "/home/prefect/.local/lib/python3.9/site-packages/anyio/_core/_sockets.py", line 221, in connect_tcp
    await event.wait()
  File "/home/prefect/.local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 597, in __aexit__
    raise exceptions[0]
  File "/home/prefect/.local/lib/python3.9/site-packages/anyio/_core/_sockets.py", line 167, in try_connect
    stream = await asynclib.connect_tcp(remote_host, remote_port, local_address)
  File "/home/prefect/.local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 1627, in connect_tcp
    await get_running_loop().create_connection(
  File "/usr/lib64/python3.9/asyncio/base_events.py", line 1050, in create_connection
    sock = await self._connect_sock(
  File "/usr/lib64/python3.9/asyncio/base_events.py", line 961, in _connect_sock
    await self.sock_connect(sock, address)
  File "/usr/lib64/python3.9/asyncio/selector_events.py", line 500, in sock_connect
    return await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/prefect/.local/lib/python3.9/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "/home/prefect/.local/lib/python3.9/site-packages/httpcore/backends/asyncio.py", line 121, in connect_tcp
    stream._raw_socket.setsockopt(*option)  # type: ignore[attr-defined] # pragma: no cover
  File "/home/prefect/.local/lib/python3.9/site-packages/anyio/_core/_tasks.py", line 119, in __exit__
    raise TimeoutError
TimeoutError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/prefect/.local/lib/python3.9/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/home/prefect/.local/lib/python3.9/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/home/prefect/.local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 261, in handle_async_request
    raise exc
  File "/home/prefect/.local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 245, in handle_async_request
    response = await connection.handle_async_request(request)
  File "/home/prefect/.local/lib/python3.9/site-packages/httpcore/_async/connection.py", line 92, in handle_async_request
    raise exc
  File "/home/prefect/.local/lib/python3.9/site-packages/httpcore/_async/connection.py", line 69, in handle_async_request
    stream = await self._connect(request)
  File "/home/prefect/.local/lib/python3.9/site-packages/httpcore/_async/connection.py", line 117, in _connect
    stream = await self._network_backend.connect_tcp(**kwargs)
  File "/home/prefect/.local/lib/python3.9/site-packages/httpcore/backends/auto.py", line 31, in connect_tcp
    return await self._backend.connect_tcp(
  File "/home/prefect/.local/lib/python3.9/site-packages/httpcore/backends/asyncio.py", line 121, in connect_tcp
    stream._raw_socket.setsockopt(*option)  # type: ignore[attr-defined] # pragma: no cover
  File "/usr/lib64/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/prefect/.local/lib/python3.9/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectTimeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/prefect/.local/lib/python3.9/site-packages/prefect/client/base.py", line 193, in _send_with_retry
    response = await request()
  File "/home/prefect/.local/lib/python3.9/site-packages/httpx/_client.py", line 1617, in send
    response = await self._send_handling_auth(
  File "/home/prefect/.local/lib/python3.9/site-packages/httpx/_client.py", line 1645, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/home/prefect/.local/lib/python3.9/site-packages/httpx/_client.py", line 1682, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/home/prefect/.local/lib/python3.9/site-packages/httpx/_client.py", line 1719, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/home/prefect/.local/lib/python3.9/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/lib64/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/prefect/.local/lib/python3.9/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectTimeout
z
Ah okay so it can’t talk to the API
I can only assume this is a weird thing about the machine you’re on
We run agents against local servers for testing on multiple operating systems
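One way to narrow down an `httpx.ConnectTimeout` like the one above is to test plain TCP reachability of the API port from the machine where the agent runs. The following is only a minimal sketch: the helper name `can_connect` is hypothetical, and port 4200 is taken from the commands in this thread.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures
        return False


if __name__ == "__main__":
    # Addresses to probe; add the instance's public IP to compare results
    for host in ("127.0.0.1", "localhost"):
        print(host, can_connect(host, 4200))
```

If the loopback address is reachable but the public IP is not, the agent is probably pointed at an address that the machine cannot reach from itself, rather than the server being down.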
r
Ok, I think I figured it out. When I first started the server with
--host 0.0.0.0
I had an error from the UI saying that it could not find the API. I then tried a few things before finding out that I had to set
PREFECT_SERVER_API_HOST
to
<remote-ip>
. I think I set the wrong variable along the way: setting
PREFECT_API_URL
to
<remote-ip>
was probably a bad move.
So for future reference, on EC2:
prefect config set PREFECT_SERVER_API_HOST=<remote-ip>
prefect server start --host 0.0.0.0
Thanks Zanie for taking the time on this one 🙂
z
Actually you want to set the host to 0.0.0.0 and the
PREFECT_UI_API_URL
to the remote address for the API
PREFECT_UI_API_URL = Setting(
    str,
    default=None,
    value_callback=default_ui_api_url,
)
"""The connection url for communication from the UI to the API.
Defaults to `PREFECT_API_URL` if set. Otherwise, the default URL is generated from
`PREFECT_SERVER_API_HOST` and `PREFECT_SERVER_API_PORT`. If providing a custom value,
the aforementioned settings may be templated into the given string.
"""
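The fallback order described in that docstring can be sketched in plain Python. This is an illustration only, not Prefect's actual `default_ui_api_url` implementation: the function name `resolve_ui_api_url`, the `http://host:port/api` URL shape, and the example address `203.0.113.10` are all assumptions made for the sketch.

```python
from typing import Optional


def resolve_ui_api_url(
    api_url: Optional[str],
    server_host: str,
    server_port: int = 4200,
    ui_api_url: Optional[str] = None,
) -> str:
    """Mimic the documented fallback order for the URL the UI uses to reach the API."""
    if ui_api_url:
        # An explicit PREFECT_UI_API_URL wins
        return ui_api_url
    if api_url:
        # Otherwise fall back to PREFECT_API_URL if it is set
        return api_url
    # Otherwise build the URL from the server host and port (assumed format)
    return f"http://{server_host}:{server_port}/api"


# Server bound to 0.0.0.0 with nothing else set: the UI is told to call 0.0.0.0
print(resolve_ui_api_url(None, "0.0.0.0"))
# Setting only the UI URL fixes the browser without changing PREFECT_API_URL
print(resolve_ui_api_url(None, "0.0.0.0", ui_api_url="http://203.0.113.10:4200/api"))
```

Under these assumptions, setting `PREFECT_UI_API_URL` to the remote address leaves `PREFECT_API_URL` untouched, so an agent on the same machine keeps using whatever API URL it was already configured with.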