# prefect-community
r
Hi, I had a VM that was abruptly turned off over the weekend, and now I cannot get the Prefect Server UI to work. Is there a troubleshooting guide?
k
Hey @Richard Hughes, There isn't quite a troubleshooting guide besides the deployment docs and the fine folks at #prefect-server. Happy to help you get things back up and running though! Let us know if there are any specific issues we can help with.
r
Hi @Kyle Moon-Wright, every time I restart the server with
prefect server start
nothing seems to be working. I am going to try
docker system prune
and reset everything. I am at a loss as to why it is not responding.
I am wondering if I need to uninstall and re-install prefect somehow.
k
Ack I hope not - though I'm sure that a reinstall will rectify these issues. If that ends up being the case, be sure to install the latest version of Prefect (currently 0.13.3). Otherwise, is there a specific error that arises when restarting Server?
r
reinstalling did the trick!
k
Whew, awesome! Hopefully that wasn't too arduous.
r
yea, I have 4 different servers, so it is some work
@Kyle Moon-Wright I am still having problems; it seems like the docker containers are getting hung.
k
Hmm, I just checked with 0.13.3 and I've been able to pull Prefect Server and run the UI on my local machine. Can you see the services being pulled? Is there a particular place docker gets hung?
r
I just did a
docker system prune
again and am trying to restart with
prefect server start --use-volume --volume-path "D:\prefect"
k
Will it pull without the volume-path specified (just
prefect server start --use-volume
)? Do we have a running Docker and Docker-Compose that we can access?
r
Yes, I see the following:
PS C:\Windows\system32> docker ps -a
CONTAINER ID        IMAGE                          COMMAND                  CREATED             STATUS              PORTS                            NAMES
3ee85b517d61        prefecthq/ui:latest            "/docker-entrypoint.…"   37 minutes ago      Up 37 minutes       80/tcp, 0.0.0.0:8080->8080/tcp   cli_ui_1
333538cd9075        prefecthq/apollo:latest        "npm run serve"          37 minutes ago      Up 37 minutes       0.0.0.0:4200->4200/tcp           cli_apollo_1
f5778c9b2f87        prefecthq/server:latest        "python src/prefect_…"   37 minutes ago      Up 37 minutes                                        cli_towel_1
b2dcfcdaae30        prefecthq/server:latest        "bash -c 'prefect-se…"   38 minutes ago      Up 37 minutes       0.0.0.0:4201->4201/tcp           cli_graphql_1
e719fe8c5eaf        hasura/graphql-engine:v1.3.0   "graphql-engine serve"   38 minutes ago      Up 38 minutes       0.0.0.0:3000->3000/tcp           cli_hasura_1
6f77f20517de        postgres:11                    "docker-entrypoint.s…"   38 minutes ago      Up 38 minutes       0.0.0.0:5432->5432/tcp           cli_postgres_1
graphql_1   | Error: HTTPConnectionPool(host='hasura', port=3000): Max retries exceeded with url: /v1/query (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f15e35ae410>: Failed to establish a new connection: [Errno 111] Connection refused'))
k
Hmm, let me get some more information for you.
r
I tried to restart this container, and it seems I cannot do that with
docker restart b2dcfcdaae30
k
Got some insight: it looks like this error usually occurs when postgres can't start because the port is already taken. Can we confirm that the postgres port is available? I believe it defaults to 5432.
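One way to confirm whether a port is already taken is to try a TCP connection to it from the host. The following is only a minimal sketch: the helper name `port_in_use` is hypothetical, and the port list is simply the set of host ports shown in the `docker ps -a` output earlier in this thread. Run it while Prefect Server is stopped; any port reported as in use is held by some other process.

```python
import socket


def port_in_use(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Host ports published by the containers listed in this thread.
    services = {
        "postgres": 5432,
        "hasura": 3000,
        "apollo": 4200,
        "graphql": 4201,
        "ui": 8080,
    }
    for name, port in services.items():
        state = "in use" if port_in_use("127.0.0.1", port) else "free"
        print(f"{name:<10} {port:<6} {state}")
```

With the server running, the same check should instead report every port as in use, which is a quick way to see which service never came up.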
r
Here are the logs from postgres:
PostgreSQL Database directory appears to contain a database; Skipping initialization

2020-08-24 18:47:47.872 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2020-08-24 18:47:47.873 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2020-08-24 18:47:47.922 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-08-24 18:47:48.257 UTC [24] LOG:  database system was interrupted; last known up at 2020-08-24 18:32:42 UTC
2020-08-24 18:48:01.942 UTC [25] FATAL:  the database system is starting up
2020-08-24 18:48:02.059 UTC [26] FATAL:  the database system is starting up
2020-08-24 18:48:07.018 UTC [24] LOG:  database system was not properly shut down; automatic recovery in progress
2020-08-24 18:48:07.108 UTC [24] LOG:  redo starts at 0/18433B8
2020-08-24 18:48:07.112 UTC [24] LOG:  invalid record length at 0/18433F0: wanted 24, got 0
2020-08-24 18:48:07.112 UTC [24] LOG:  redo done at 0/18433B8
2020-08-24 18:48:07.559 UTC [1] LOG:  database system is ready to accept connections
k
Hey @Richard Hughes, Apologies for the delay! Got some further insight here - it looks likely that your DB was corrupted from whatever shut down the VM running your DB. Based on the log below in particular, it looks like it's failing to use the mounted volume you've specified for some reason:
2020-08-24 18:48:07.018 UTC [24] LOG:  database system was not properly shut down; automatic recovery in progress
In which case, restoring from a backup if available may be your best option.
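To make this kind of log easier to read at a glance, the messages can be checked programmatically. This is a rough sketch, not an official tool: the function name `scan_postgres_log` and the pattern labels are hypothetical, and the patterns cover only the message texts that appear in the postgres log posted in this thread.

```python
import re

# Message fragments taken from the postgres log quoted in this thread.
PATTERNS = {
    "unclean_shutdown": re.compile(
        r"database system was (interrupted|not properly shut down)"
    ),
    "recovery_done": re.compile(r"redo done at"),
    "ready": re.compile(r"database system is ready to accept connections"),
}


def scan_postgres_log(text: str) -> dict:
    """Report which of the known messages appear anywhere in the log text."""
    found = {name: False for name in PATTERNS}
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                found[name] = True
    return found


# Three lines copied from the log above, used as sample input.
SAMPLE = """\
2020-08-24 18:48:07.018 UTC [24] LOG:  database system was not properly shut down; automatic recovery in progress
2020-08-24 18:48:07.112 UTC [24] LOG:  redo done at 0/18433B8
2020-08-24 18:48:07.559 UTC [1] LOG:  database system is ready to accept connections
"""

if __name__ == "__main__":
    for name, seen in scan_postgres_log(SAMPLE).items():
        print(f"{name}: {seen}")
```

The log text itself can be captured with `docker logs cli_postgres_1` (the container name from the `docker ps -a` output above) and passed to the function as a string.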
r
I created this database today; however, we will see after the backup is restored.
k
Hmm interesting, please keep us posted.
r
@Kyle Moon-Wright after restoring the VM backup, I was able to get the servers working. Thanks for your assistance.