Ismail Cenik

05/18/2021, 3:56 PM
Hello, what is the reason for the following error? Pod prefect-job-9e6e019e-2tpvl failed. Container 'flow' state: terminated Exit Code:: 139 Reason: Error

Kevin Kho

05/18/2021, 4:00 PM
Hey @Ismail Cenik, could you give me more details about this? Was it a flow that was working before? Are there any logs on the flow?
What was the flow doing?

Ismail Cenik

05/18/2021, 4:04 PM
b'{"taskmanagers":5,"slots-total":35,"slots-available":3,"jobs-running":1,"jobs-finished":0,"jobs-cancelled":0,"jobs-failed":0,"flink-version":"1.11.1","flink-commit":"DeadD0d0"}'
The flow starts AWS Kinesis Data Analytics applications and stops them when they finish their job. Basically, Prefect is calling the startApplication and stopApplication APIs of Kinesis Data Analytics.

Kevin Kho

05/18/2021, 4:06 PM
This looks more like the flow failed. Any logs you see on the CloudWatch side?

Ismail Cenik

05/18/2021, 4:08 PM
Do you mean that Kinesis failed? Should I look at the Kinesis CloudWatch logs?

Kevin Kho

05/18/2021, 4:10 PM
Not necessarily Kinesis, just something in the flow.
Is that flink-commit stuff from the Prefect logs?

Ismail Cenik

05/18/2021, 4:13 PM
Actually, this is not a good example; it is just the printout of one API call in the Prefect log.
I will try to find more meaningful logs. But there should be an explanation for "139". Is there any specific meaning?

Kevin Kho

05/18/2021, 4:15 PM
I see. Hopefully there is more info in CloudWatch?

Ismail Cenik

05/18/2021, 4:16 PM
I do not have direct access to the EKS cluster where our agent is running. I will try to reach out to the people who do.

Kevin Kho

05/18/2021, 4:17 PM
That’s a container error, not a Prefect one, so reaching out is pretty much the next step.

Tyler Wanner

05/18/2021, 4:33 PM
Generally, an exit code above 128 means the container was terminated by a signal (128 plus the signal number). 139 is 128 + 11, i.e. SIGSEGV (a segmentation fault); the closely related 137 is SIGKILL, which is generally due to running out of memory or failing a liveness check.
I can't provide a lot of additional context without knowledge about your execution environment, but the cause is almost certainly in your execution infrastructure rather than in Prefect. In Kubernetes, when memory pressure is the culprit, this is usually fixed by increasing the memory resource limit.
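
A side note on reading the exit code: by the usual shell and container-runtime convention, a status above 128 is 128 plus the number of the signal that ended the process. Here is a minimal Python sketch of that decoding, assuming a Linux host where the standard `signal` module knows the signal names. The helper name `describe_exit_code` is made up for illustration and is not part of Prefect or Kubernetes.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit code using the 128 + N signal convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name for this signal number on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number this platform knows about.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with application status {code}"


print(describe_exit_code(139))  # on Linux: terminated by SIGSEGV
print(describe_exit_code(137))  # on Linux: terminated by SIGKILL
```

On Linux this decodes 139 as SIGSEGV (signal 11) and 137 as SIGKILL (signal 9), the code typically seen when a container is killed for exceeding its memory limit.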

Ismail Cenik

05/18/2021, 4:42 PM
Hey guys, thank you for the valuable information
Hello, there is no default value for the memory resource. Is there any standardization or recommendation for the memory resource limit?

Kevin Kho

05/18/2021, 6:20 PM
It’s hard for us to prescribe a value because it really varies from case to case, but what we do mention is that the default Kubernetes resource specs tend to be low for what people do with Prefect.
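
For reference, the Kubernetes setting in question is the container's `resources` block, with `requests.memory` and `limits.memory`. Below is a minimal sketch of that fragment built as a plain Python dict. The helper name `memory_resources` and the `512Mi` / `1Gi` values are placeholders for illustration, not recommendations.

```python
import json


def memory_resources(request: str, limit: str) -> dict:
    """Build the `resources` fragment of a Kubernetes container spec.

    `request` is what the scheduler reserves for the container; `limit` is
    the ceiling above which the container gets OOM-killed.
    """
    return {
        "requests": {"memory": request},
        "limits": {"memory": limit},
    }


# Placeholder values; size them from the flow's observed memory usage.
resources = memory_resources(request="512Mi", limit="1Gi")
print(json.dumps({"resources": resources}, indent=2))
```

In a job template, this dict corresponds to the `resources` field of the flow container. As far as I know, Prefect's `KubernetesRun` run config in the 0.14.x line also accepts `memory_request` and `memory_limit` arguments directly, but treat that as an assumption and check the docs for the version you run.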