Gabriel Milan
03/18/2022, 1:17 PM
hasura:
  image:
    name: hasura/graphql-engine
    tag: v1.3.3
    pullPolicy: IfNotPresent
    pullSecrets: []
  service:
    type: ClusterIP
    port: 3000
  labels: {}
  annotations: {}
  replicas: 2
  strategy: {}
  podSecurityContext: {}
  securityContext: {}
  env: []
  resources:
    limits:
      cpu: "500m"
      memory: "1Gi"
    requests:
      cpu: "100m"
      memory: "256Mi"
  nodeSelector: {}
  tolerations: []
  affinity: {}
Both the pods crash at the same time and keep restarting a few times before it's back up, so my Server gets a significant amount of downtime. I don't think the resources are underestimated, but I do see a pattern in RAM usage when this happens (I've sent a screenshot). I've also attached some logs for the pods, but I can't find any relevant information in them. Is there anywhere else I could gather information for this issue?
Kevin Kho
03/18/2022, 1:56 PM
Anna Geller
03/18/2022, 2:44 PM
unlocking events that are locked by the HGE
which doesn't indicate anything suspicious per se
"the pods crash at the same time and keep restarting a few times before it's back up"
I remember that your networking setup is quite involved. Given that Hasura pods are coming back up quickly again, it might be some transient networking issue in your Kubernetes service
Gabriel Milan
03/18/2022, 3:22 PM
"it might be some transient networking issue in your Kubernetes service"
I did think about that, but the thing is none of the other stuff we host on this cluster went down, which is weird. I've scaled it down to a single replica and now it looks like it's stable, but I still don't understand why
"btw I remember you promised a blog post or a code repository about your setup on Kubernetes"
yeah! we'll work on it soon, I hope! I'll be glad to share it when it's ready!
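On the earlier question of where else to gather information: given the RAM pattern mentioned above, one place that might help is the last termination state Kubernetes records for each container, which shows whether a restart followed an out-of-memory kill or some other exit. Below is a minimal sketch, assuming `kubectl` is already configured for the cluster. The helper names `summarize_restarts` and `fetch_pods` are hypothetical, not part of any Prefect or Hasura tooling, and nothing in the thread confirms what the actual termination reason was.

```python
import json
import subprocess


def summarize_restarts(pod_list):
    """Return one record per container that has exited at least once.

    `pod_list` is the parsed output of `kubectl get pods -o json`.
    """
    records = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        for status in pod.get("status", {}).get("containerStatuses", []):
            # lastState.terminated is only populated after a previous exit.
            terminated = status.get("lastState", {}).get("terminated")
            if not terminated:
                continue
            records.append({
                "pod": pod_name,
                "container": status["name"],
                "restarts": status.get("restartCount", 0),
                # Kubernetes reports "OOMKilled" when the memory limit was hit.
                "reason": terminated.get("reason"),
                "exit_code": terminated.get("exitCode"),
                "finished_at": terminated.get("finishedAt"),
            })
    return records


def fetch_pods(namespace, selector):
    """Run kubectl and parse its JSON output (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-l", selector, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

With cluster access, `summarize_restarts(fetch_pods(namespace, selector))` would list each restarted container with its last exit reason; the namespace and label selector depend on how the release was installed, so they are left as parameters here. Comparing `finished_at` across both pods could also show whether they really went down at the same moment, and `kubectl logs --previous` on a restarted pod returns the logs of the container instance that crashed rather than the current one.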