https://prefect.io logo
Title
a

Alex F

03/09/2022, 4:58 PM
Hi All , I am having issue with prefect server. Then server is starting on server ip and config.toml is set to apollo_url="http://10.1.x.x:4200/graphql" everything starts normally and no errors during start , the ui shows green connection to API, however 1-2 minutes later it turns yellow and stops working , i turn debug=true but I am not able to find logs are written or how to troubleshoot this problem. Any help would be appreciated (this is local install on ubuntu)
k

Kevin Kho

03/09/2022, 5:04 PM
How did you start your server? prefect server start or helm?
a

Alex F

03/09/2022, 5:10 PM
like this - > prefect server start --detach
k

Kevin Kho

03/09/2022, 5:12 PM
Maybe you need the expose flag?
prefect server start --expose
a

Alex F

03/09/2022, 5:37 PM
I tried , no log entries coming up from graphql or port 4200 , all the log entries are ui_1 or towel_1, trying wget on port 4200 gets error 200 I did see this in startup graphql_1 | Could not upgrade the database! graphql_1 | Error: HTTPConnectionPool(host='hasura', port=3000): Max retries exceeded with url: /v1/query (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7ff7e3b873d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
k

Kevin Kho

03/09/2022, 5:40 PM
Are your ports open for access on the server?
a

Alex F

03/09/2022, 5:41 PM
yes , as far as i know systemd-r 415 systemd-resolve 13u IPv4 19227 0t0 TCP 127.0.0.53:53 (LISTEN) sshd 771 root 3u IPv4 24314 0t0 TCP *:22 (LISTEN) sshd 771 root 4u IPv6 24325 0t0 TCP *:22 (LISTEN) container 17698 root 12u IPv4 114583 0t0 TCP 127.0.0.1:37907 (LISTEN) docker-pr 1294360 root 4u IPv4 4606111 0t0 TCP 127.0.0.1:5432 (LISTEN) docker-pr 1294817 root 4u IPv4 4609192 0t0 TCP *:4200 (LISTEN) docker-pr 1295053 root 4u IPv4 4608779 0t0 TCP *:8080 (LISTEN) docker-pr 1295269 root 4u IPv4 4611242 0t0 TCP 127.0.0.1:3000 (LISTEN) docker-pr 1295281 root 4u IPv4 4611253 0t0 TCP 127.0.0.1:4201 (LISTEN)
k

Kevin Kho

03/09/2022, 5:42 PM
Ohh this isn’t even about remote connection. Looks like hasura is not even spinning up properly? I should say the logs are noisy though so you can test better by trying to go to port 4200 to the graphql playground. Does that load for you?
a

Alex F

03/09/2022, 5:43 PM
no 4200 does not load at all I see bunch of errors on hasura start ui_1 | 2022/03/09 17:29:46 [notice] 13#13: start worker process 15 apollo_1 | Checking GraphQL service at http://graphql:4201/health ... towel_1 | {"severity": "ERROR", "name": "prefect-server.Scheduler", "message": "Unexpected error: ConnectError(gaierror(-3, 'Temporary failure in name resolution'))", "exc_info": "Traceback (most recent call last):\n File \"/prefect-server/src/prefect_server/services/loop_service.py\", line 60, in run\n await self.run_once()\n File \"/prefect-server/src/prefect_server/services/towel/scheduler.py\", line 47, in run_once\n offset=500 * iterations,\n File \"/prefect-server/src/prefect_server/database/orm.py\", line 501, in get\n as_box=not apply_schema,\n File \"/prefect-server/src/prefect_server/database/hasura.py\", line 85, in execute\n as_box=as_box,\n File \"/prefect-server/src/prefect_server/utilities/graphql.py\", line 64, in execute\n timeout=30,\n File \"/usr/local/lib/python3.7/site-packages/httpx/_client.py\", line 1385, in post\n timeout=timeout,\n File \"/usr/local/lib/python3.7/site-packages/httpx/_client.py\", line 1148, in request\n request, auth=auth, allow_redirects=allow_redirects, timeout=timeout,\n File \"/usr/local/lib/python3.7/site-packages/httpx/_client.py\", line 1169, in send\n request, auth=auth, timeout=timeout, allow_redirects=allow_redirects,\n File \"/usr/local/lib/python3.7/site-packages/httpx/_client.py\", line 1196, in send_handling_redirects\n request, auth=auth, timeout=timeout, history=history\n File \"/usr/local/lib/python3.7/site-packages/httpx/_client.py\", line 1232, in send_handling_auth\n response = await self.send_single_request(request, timeout)\n File \"/usr/local/lib/python3.7/site-packages/httpx/_client.py\", line 1269, in send_single_request\n timeout=timeout.as_dict(),\n File \"/usr/local/lib/python3.7/site-packages/httpcore/_async/connection_pool.py\", line 153, in request\n method, url, headers=headers, stream=stream, timeout=timeout\n File \"/usr/local/lib/python3.7/site-packages/httpcore/_async/connection.py\", line 65, in request\n self.socket = await self._open_socket(timeout)\n File \"/usr/local/lib/python3.7/site-packages/httpcore/_async/connection.py\", line 86, in _open_socket\n hostname, port, ssl_context, timeout\n File \"/usr/local/lib/python3.7/site-packages/httpcore/_backends/auto.py\", line 38, in open_tcp_stream\n return await self.backend.open_tcp_stream(hostname, port, ssl_context, timeout)\n File \"/usr/local/lib/python3.7/site-packages/httpcore/_backends/asyncio.py\", line 234, in open_tcp_stream\n stream_reader=stream_reader, stream_writer=stream_writer\n File \"/usr/local/lib/python3.7/contextlib.py\", line 130, in __exit__\n self.gen.throw(type, value, traceback)\n File \"/usr/local/lib/python3.7/site-packages/httpcore/_exceptions.py\", line 12, in map_exceptions\n raise to_exc(exc) from None\nhttpcore._exceptions.ConnectError: [Errno -3] Temporary failure in name resolution"}
after server restart I see it's up and running but 1-2 minutes late it's crashed with no errors ui_1 | 192.168.192.7 - - [09/Mar/2022:17:46:08 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.74.0" "-" graphql_1 | INFO: 192.168.192.4:52928 - "GET /health HTTP/1.1" 200 OK graphql_1 | INFO: 192.168.192.4:52930 - "GET /health HTTP/1.1" 200 OK ui_1 | 192.168.192.7 - - [09/Mar/2022:17:46:38 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.74.0" "-" graphql_1 | INFO: 192.168.192.4:52932 - "GET /health HTTP/1.1" 200 OK ui_1 | 192.168.192.7 - - [09/Mar/2022:17:47:09 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.74.0" "-" graphql_1 | INFO: 192.168.192.4:52934 - "GET /health HTTP/1.1" 200 OK
k

Kevin Kho

03/09/2022, 5:50 PM
That is very weird behavior. What are the specs of the underlying VM?
a

Alex F

03/09/2022, 5:53 PM
it's AWS Ubuntu instance.. if you tell me command you want me to run i can give you more details , I see docker instances are running 8GB.
k

Kevin Kho

03/09/2022, 5:57 PM
is the container not even around to get logs from?
a

Alex F

03/09/2022, 5:59 PM
container is up and running but on 192.168 subnet , but for some strange reason the doker is not mapping to 10.1 (local subnet) it's like network between docker and local network is not working properly 56ecc673097a prefecthq/ui:core-1.0.0 "/docker-entrypoint.…" 11 minutes ago Up 11 minutes (healthy) 80/tcp, 0.0.0.0:8080->8080/tcp tmp_ui_1 516b57f11ba7 prefecthq/server:core-1.0.0 "tini -g -- python s…" 11 minutes ago Up 11 minutes tmp_towel_1 3e11f00441f7 prefecthq/apollo:core-1.0.0 "tini -g -- bash -c …" 11 minutes ago Up 11 minutes (healthy) 0.0.0.0:4200->4200/tcp tmp_apollo_1 0a7b1510a618 prefecthq/server:core-1.0.0 "tini -g -- bash -c …" 11 minutes ago Up 11 minutes (healthy) 127.0.0.1:4201->4201/tcp tmp_graphql_1 1f7b76dc022f hasura/graphql-engine:v2.1.1 "graphql-engine serve" 11 minutes ago Up 11 minutes (healthy) 127.0.0.1:3000->3000/tcp tmp_hasura_1 41b8009f4001 postgres:11 "docker-entrypoint.s…" 11 minutes ago Up 11 minutes (healthy) 127.0.0.1:5432->5432/tcp tmp_postgres_1 ubuntu@i-0fbe8a4a692337a4b:~$
now i see all the logs but on docker subnets ui_1 | 192.168.208.7 - - [09/Mar/2022:17:58:10 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.74.0" "-" graphql_1 | INFO: 192.168.208.4:43936 - "GET /health HTTP/1.1" 200 OK graphql_1 | INFO: 192.168.208.4:43938 - "GET /health HTTP/1.1" 200 OK ui_1 | 192.168.208.7 - - [09/Mar/2022:17:58:40 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.74.0" "-" graphql_1 | INFO: 192.168.208.4:43940 - "GET /health HTTP/1.1" 200 OK ui_1 | 192.168.208.7 - - [09/Mar/2022:17:59:11 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.74.0" "-" graphql_1 | INFO: 192.168.208.4:43942 - "GET /health HTTP/1.1" 200 OK towel_1 | {"severity": "DEBUG", "name": "prefect-server.ZombieKiller", "message": "Heartbeat from ZombieKiller: next run at 2022-03-09T18:01:39+00:00"} ui_1 | 192.168.208.7 - - [09/Mar/2022:17:59:41 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl/7.74.0" "-" graphql_1 | INFO: 192.168.208.4:43944 - "GET /health HTTP/1.1" 200 OK
Kevin , I want to say thank you , I got my network guy review AWS security setup and it looks like it started working , Thank you for your help to point me in the right direction
👍 1
k

Kevin Kho

03/09/2022, 6:44 PM
What!? I just got off a call. What did they do? Would like to know lol
a

Alex F

03/09/2022, 7:45 PM
lol 🙂 all they did is open VM to be able to communicate on VM IP:4200 , only 8080 was open as i was configuring UI on 10.1.x.x:8080 IP and API needs to communicate on same IP 10.1.x.x:4200 in order for it to work
a

Anna Geller

03/09/2022, 7:46 PM
that's true! thanks for updating us on that and good job! 🙌
a

Alex F

03/09/2022, 7:47 PM
I do have another problem now 🙂 if you don't mind 🙂 prefect register --project "Project Name" --path ./Prefect/ I am getting error : raise InvalidSchema("No connection adapters were found for {!r}".format(url)) requests.exceptions.InvalidSchema: No connection adapters were found for '10.1.x.x:4200'
a

Anna Geller

03/09/2022, 7:50 PM
Could you open this as a separate issue? 🙏 I sort of closed this thread and transcribed to Discourse https://discourse.prefect.io/t/when-starting-prefect-server-on-aws-ec2-using-ubuntu-os-[…]onnection-errno-111-connection-refused-how-to-fix-this/460
we generally prefer one thread per problem/issue for better structure and to make things easier to follow up
a

Alex F

03/09/2022, 7:54 PM
Absolutely Anna.
:upvote: 1