Prefect Community #prefect-server

Gabriel Milan

03/18/2022, 1:17 PM

Hi all! I'm having issues with my Hasura pods. They keep restarting every once in a while and I'm not sure why this is happening. The values for the helm chart I'm using for Hasura are the following:

hasura:
  image:
    name: hasura/graphql-engine
    tag: v1.3.3
    pullPolicy: IfNotPresent
    pullSecrets: []

  service:
    type: ClusterIP
    port: 3000

  labels: {}
  annotations: {}
  replicas: 2
  strategy: {}
  podSecurityContext: {}
  securityContext: {}
  env: []
  resources:
    limits:
      cpu: "500m"
      memory: "1Gi"
    requests:
      cpu: "100m"
      memory: "256Mi"
  nodeSelector: {}
  tolerations: []
  affinity: {}

Both pods crash at the same time and restart a few times before they're back up, so my Server gets a significant amount of downtime. I don't think the resources are underestimated, but I do see a pattern in RAM usage when this happens (I've sent a screenshot). I've also attached some logs for the pods, but I can't find any relevant information in them. Is there anywhere else I could gather information about this issue?

prefect-hasura-57f8b8d4bb-zdd65.log prefect-hasura-57f8b8d4bb-68z2z.log
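
On the question of where else to look: besides the current container logs, Kubernetes keeps the last termination state of each container in the pod status, which is what `kubectl describe pod` prints and what `kubectl logs <pod> --previous` complements with the logs of the crashed container. A minimal sketch, assuming the output of `kubectl get pods -o json` has been saved or loaded as a dict; the function name `summarize_restarts` and the sample data are hypothetical and are not taken from the attached logs:

```python
from typing import Any, Dict, List


def summarize_restarts(pod_list: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one row per container with its restart count and last termination details."""
    rows = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        for status in pod.get("status", {}).get("containerStatuses", []):
            # lastState.terminated is only present if the container has exited before
            terminated = status.get("lastState", {}).get("terminated") or {}
            rows.append({
                "pod": pod_name,
                "container": status["name"],
                "restarts": status.get("restartCount", 0),
                "reason": terminated.get("reason"),
                "exit_code": terminated.get("exitCode"),
                "finished_at": terminated.get("finishedAt"),
            })
    return rows


# Hypothetical sample shaped like `kubectl get pods -o json` output (made-up values)
SAMPLE = {
    "items": [
        {
            "metadata": {"name": "example-hasura-pod"},
            "status": {
                "containerStatuses": [
                    {
                        "name": "hasura",
                        "restartCount": 4,
                        "lastState": {
                            "terminated": {
                                "reason": "OOMKilled",
                                "exitCode": 137,
                                "finishedAt": "2022-01-01T00:00:00Z",
                            }
                        },
                    }
                ]
            },
        }
    ]
}

if __name__ == "__main__":
    for row in summarize_restarts(SAMPLE):
        print(row)
```

With real data, the dict would come from something like `json.load(open("pods.json"))` after running `kubectl get pods -o json > pods.json` in the relevant namespace. A reason of `OOMKilled` would point at the memory limit, while `Error` with a non-zero exit code would suggest the process itself exited, for example after failing a health check or losing its database connection.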

Kevin Kho

03/18/2022, 1:56 PM

I looked at the error, and I am not sure if it’s related to scaling down? Didn’t see any issues around those logs.

Kevin Kho

03/18/2022, 1:57 PM

Wondering if upgrading to Hasura 2.0 might help you

Kevin Kho

03/18/2022, 2:12 PM

Will see if the team has any other ideas

Kevin Kho

03/18/2022, 2:21 PM

The guys who’d have a clue are out today unfortunately

Anna Geller

03/18/2022, 2:44 PM

I agree that the logs don't provide much useful information, especially because there is no error there. This is just an event log:

unlocking events that are locked by the HGE

which doesn't indicate anything suspicious per se

Anna Geller

03/18/2022, 2:45 PM

the pods crash at the same time and keep restarting a few times before it's back up

I remember that your networking setup is quite involved. Given that Hasura pods are coming back up quickly again, it might be some transient networking issue in your Kubernetes service

Anna Geller

03/18/2022, 2:46 PM

btw I remember you promised a blog post or a code repository about your setup on Kubernetes - I'm counting on it! 😄 no pressure though, only if you find time to share your learnings

Gabriel Milan

03/18/2022, 3:22 PM

it might be some transient networking issue in your Kubernetes service

I did think about that, but the thing is none of the other stuff we host on this cluster went down, which is weird. I've scaled it down to a single replica and now it looks like it's stable, but I still don't understand why

Gabriel Milan

03/18/2022, 3:24 PM

btw I remember you promised a blog post or a code repository about your setup on Kubernetes

yeah! we'll work on it soon, I hope! I'll be glad to share it when it's ready!
