On a K8s deployed Prefect server the graphql pod is stuck in Prefect Community #prefect-server

# prefect-server

Alex Furrier

08/19/2021, 11:14 PM

On a K8s-deployed Prefect server, the graphql pod is stuck in CrashLoopBackOff. Not sure why it exited as Completed?

Alex Furrier

08/19/2021, 11:17 PM

I deleted the pod and when a new one was spun up it was successful. Not sure what caused this issue though.

Kevin Kho

08/19/2021, 11:20 PM

Hey @Alex Furrier, from experience Server startup does hiccup sometimes, and if it's successful after the restart you should be fine. It's also weird that everything looks good in these logs. Could you move the logs to the thread when you get a chance so we don't crowd the main channel?

👍 1

Alex Furrier

08/19/2021, 11:20 PM

These are the pod logs:

{"severity": "INFO", "name": "prefect-server.GraphQL Server", "message": "Using uvicorn log level = 'debug'"}
INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:4201 (Press CTRL+C to quit)
INFO:     10.244.19.1:55684 - "GET /health HTTP/1.1" 200 OK
INFO:     10.244.19.1:55692 - "GET /health HTTP/1.1" 200 OK
INFO:     10.244.16.156:41468 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41482 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41144 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41504 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41508 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41356 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41514 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41526 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41592 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41590 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41380 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41236 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41596 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41378 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41234 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41600 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41238 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41688 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41694 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41700 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41704 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41702 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41706 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41712 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41318 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41320 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41718 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41720 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41722 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     Shutting down
INFO:     Waiting for connections to close. (CTRL+C to force quit)
INFO:     10.244.16.156:41862 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41012 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41014 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41868 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41886 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:40986 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41072 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41878 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41010 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:40958 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41724 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:42090 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     10.244.16.156:41726 - "POST /graphql/ HTTP/1.1" 200 OK
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [1]

And this is the last termination message:

lastState:
        terminated:
          containerID: containerd://f24f7214318a9605d579ea96b19b5fdbe8ba829ecbf1f4849d4ff0bd3aca5a27
          exitCode: 0
          finishedAt: "2021-08-19T23:08:36Z"
          reason: Completed
          startedAt: "2021-08-19T23:07:40Z"
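
To put the termination message in context, here is a minimal Python sketch that computes how long the container ran from the `startedAt` and `finishedAt` values above. The function name `container_lifetime_seconds` is an illustrative assumption, not part of Prefect or Kubernetes tooling.

```python
from datetime import datetime

# Timestamp layout used in the lastState block above (UTC, "Z" suffix).
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def container_lifetime_seconds(started_at: str, finished_at: str) -> float:
    """Return the seconds elapsed between two Kubernetes-style timestamps.

    Hypothetical helper for illustration only.
    """
    start = datetime.strptime(started_at, TIMESTAMP_FORMAT)
    end = datetime.strptime(finished_at, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


# Values copied from the termination message above.
lifetime = container_lifetime_seconds("2021-08-19T23:07:40Z", "2021-08-19T23:08:36Z")
print(lifetime)  # 56.0
```

So the container ran for under a minute and then exited with code 0, which matches the orderly "Shutting down" sequence in the pod logs rather than an unhandled error.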

Kevin Kho

08/19/2021, 11:21 PM

Thank you!

Alex Furrier

08/19/2021, 11:22 PM

Not sure if it's relevant, but there were 2 flows running at the time the graphql pod went down. After the restart, 1 failed and 1 kept running. That's probably down to retry differences on the tasks that were running, but it may be related to what caused the crash in the first place.

Kevin Kho

08/19/2021, 11:24 PM

Oh I thought this was on spinup. That’s weird. I guess bring it up if you see it again? A bit hard to tell what happened.

Alex Furrier

08/19/2021, 11:31 PM

Could it be related to large log messages? I turned on some logging for debugging purposes that involved a fairly large amount of text. Could the large logs somehow be causing crashes?

Kevin Kho

08/19/2021, 11:32 PM

In that case you would get an API error saying the request was rejected because the request entity was too large. We have seen a few of those lately.

Sam Cook

08/20/2021, 1:30 PM

All of the Prefect pods have to come up in a very specific order or else they fail into unrecoverable states. The Docker-based deployment handles this with depends_on statements in the docker-compose file, but there's no analogous construct on K8s. https://github.com/PrefectHQ/prefect/blob/master/src/prefect/cli/docker-compose.yml The best you can do to get things in the right order is to add init containers to the config that loop until the required services are up.

👍 1
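
The "loop until the required services are up" idea can be sketched in Python as a small script that an init container could run. This is only a sketch under assumptions: the function name `wait_for_port`, the timeout values, and the hostname in the usage note are hypothetical, and a plain TCP check only shows that a port is accepting connections, not that the service behind it is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll host:port until a TCP connection succeeds.

    Returns True as soon as a connection is accepted, or False once
    `timeout` seconds have passed without one. Hypothetical helper.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt is capped at `interval` seconds.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, timed out, or host not resolvable yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An init container could call something like `wait_for_port("graphql", 4201)` and exit non-zero when it returns False, so that Kubernetes retries the init step. Port 4201 is the one shown in the pod logs above; the service name `graphql` is an assumption and depends on how the deployment names its services.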

Alex Furrier

08/20/2021, 3:20 PM

@Sam Cook The issue doesn't appear to happen on container init but in the middle of a flow run. More on it in the other thread just below https://prefect-community.slack.com/archives/C014Z8DPDSR/p1629471848376900?thread_ts=1629432352.371900&cid=C014Z8DPDSR
