04/25/2023, 10:47 PM
@Marvin my flow keeps crashing with this exception: "Flow run infrastructure exited with non-zero status code 143."


04/25/2023, 10:48 PM
I found a related discussion on the Prefect Discourse forum that might help with the issue you're facing. The error you mentioned, "Flow run infrastructure exited with non-zero status code 143," could be related to issues with the agent process, the execution layer, or resource allocation. Here are some steps you can take to resolve it:
1. Verify that the agent process is running (e.g., Kubernetes deployment, ECS service, dockerd daemon).
2. Check the agent logs for any suspicious activity.
3. Ensure that your execution layer can pull your flow run's image and has the necessary permissions.
4. Verify that your execution layer has enough permissions to spin up the required resources (e.g., IAM roles, valid Prefect API key).
5. Check if your execution layer has enough capacity on the cluster to deploy your flow run. If the agent is starved for resources, try allocating more CPU and memory to the agent process.
6. If the agent is polling too frequently and consuming lots of resources, try polling less often, e.g., every 30 seconds:
   prefect config set PREFECT_AGENT_QUERY_INTERVAL='30.0'
7. Check if there is more than one agent polling for runs from the same work queue, as this can lead to pending runs that can't get deployed efficiently.
You can find more details in the Prefect Discourse discussion. I hope this helps you resolve the issue.
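One more bit of background that may help narrow things down: by the usual shell and container convention, an exit status above 128 means the process was ended by a signal, and 143 is 128 + 15, i.e. SIGTERM. That suggests something outside your flow code (for example, the container or process being stopped or evicted) asked it to terminate, which fits the resource and capacity checks above. Here is a minimal Python sketch for decoding such a status. The helper name `describe_exit_code` is hypothetical and not part of Prefect, and it assumes the reported status follows that convention:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a process exit status, assuming the common convention
    that a status above 128 means 'terminated by signal (code - 128)'."""
    if code == 0:
        return "exited cleanly"
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with error status {code}"


print(describe_exit_code(143))  # terminated by SIGTERM
```

If the decoded signal is SIGTERM, the logs of whatever manages the flow run's infrastructure (agent, container runtime, or cluster) are probably a better place to look than the flow's own traceback.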