I'm seeing erroneous Flow Run errors via Prefect C...
# prefect-cloud
k
I'm seeing erroneous Flow Run errors via Prefect Cloud Automations. Flows complete successfully (no task errors or retries), yet Automations sends a
State message: Flow run infrastructure exited with non-zero status code -1.
Quite noisy... Any advice?
Prefect Webhook  [9:21 AM]
Prefect flow run failed
Flow run load-xxx/camouflaged-vole entered state Crashed at 2024-01-18T15:21:21.506133+00:00.
Flow ID: 9288...
Flow run ID: b5ea...
Flow run URL: <https://app.prefect.cloud/account/3da>...
State message: Flow run infrastructure exited with non-zero status code -1.
Prefect Notifications | Today at 9:21 AM
Seems like the flow run temporarily went into the Crashed state, but that's not apparent from the UI
in fact, it's contradictory
I do see this odd log in our container
{
insertId: "b1czr5dqsxaxsn0x"
labels: {4}
logName: "projects/.../logs/stderr"
receiveTimestamp: "2024-01-18T15:22:21.709721704Z"
resource: {
labels: {6}
type: "k8s_container"
}
severity: "ERROR"
textPayload: "15:22:19.215 | INFO    | prefect.flow_runs.runner - Process for flow run 'camouflaged-vole' exited cleanly."
timestamp: "2024-01-18T15:22:19.217081414Z"
}
I believe this stems from prefect logging to stderr instead of stdout for all messages, even when it's INFO. Any help on a possible solution would be magical 🙏
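For reference, Python's standard `logging.StreamHandler` writes to `sys.stderr` when no stream is passed, which would fit the log entry above: an INFO message arriving on the `stderr` log and tagged with severity ERROR. Below is a minimal stdlib-only sketch of sending records below WARNING to stdout and everything else to stderr. The names `MaxLevelFilter`, `build_split_logger` and `split_demo` are hypothetical, and wiring this into Prefect's own logging configuration is not shown here.

```python
import logging
import sys


class MaxLevelFilter(logging.Filter):
    """Let through only records strictly below `max_level`."""

    def __init__(self, max_level: int) -> None:
        super().__init__()
        self.max_level = max_level

    def filter(self, record: logging.LogRecord) -> bool:
        return record.levelno < self.max_level


def build_split_logger(name: str = "split_demo") -> logging.Logger:
    """Hypothetical helper: DEBUG/INFO go to stdout, WARNING and above to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False   # avoid duplicate output via the root logger
    logger.handlers.clear()    # keep repeated calls idempotent

    fmt = logging.Formatter("%(asctime)s | %(levelname)-7s | %(name)s - %(message)s")

    # stdout handler: accepts everything below WARNING
    out_handler = logging.StreamHandler(sys.stdout)
    out_handler.addFilter(MaxLevelFilter(logging.WARNING))
    out_handler.setFormatter(fmt)

    # stderr handler: accepts WARNING and above
    err_handler = logging.StreamHandler(sys.stderr)
    err_handler.setLevel(logging.WARNING)
    err_handler.setFormatter(fmt)

    logger.addHandler(out_handler)
    logger.addHandler(err_handler)
    return logger


if __name__ == "__main__":
    log = build_split_logger()
    log.info("Process exited cleanly.")       # written to stdout
    log.error("Something actually failed.")   # written to stderr
```

With a split like this, a log collector that derives severity from the output stream would only see real warnings and errors on stderr. Whether that also affects the Crashed notification is a separate question this sketch does not answer.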
e
Hey @KG 👋 are you using subflows or calling run_deployment at all? If so, this issue might be relevant. One quick way to mitigate the issue is to set the automation to a “stays in Crashed for x seconds” rather than “enters Crashed state”… that should at least buffer out any blips.
This issue might also be relevant.
Actually looks like there are quite a few different reports. Maybe a question for you is: 1. Are these flows long running at all (> 1h)? 2. What provider are you using for your k8s cluster?
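To make the difference between "enters Crashed" and "stays in Crashed for x seconds" concrete, here is a small sketch of the debounce idea in plain Python. It is not Prefect's implementation or API: `StateChange` and `should_alert` are hypothetical names, and the timestamps are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class StateChange:
    state: str      # e.g. "Running", "Crashed", "Completed"
    at: datetime    # when the run entered that state


def should_alert(
    changes: List[StateChange],
    now: datetime,
    stays_for: timedelta,
    target: str = "Crashed",
) -> bool:
    """Alert only if the most recent state is `target` and it has been
    held for at least `stays_for` (hypothetical debounce logic)."""
    if not changes:
        return False
    latest = max(changes, key=lambda c: c.at)
    return latest.state == target and (now - latest.at) >= stays_for


if __name__ == "__main__":
    t0 = datetime(2024, 1, 18, 15, 21, 21, tzinfo=timezone.utc)
    window = timedelta(seconds=120)

    # A blip: the run is marked Crashed, then Completed about a minute later.
    blip = [
        StateChange("Crashed", t0),
        StateChange("Completed", t0 + timedelta(seconds=58)),
    ]
    print(should_alert(blip, now=t0 + window, stays_for=window))

    # A real crash: the run is still Crashed once the window has elapsed.
    stuck = [StateChange("Crashed", t0)]
    print(should_alert(stuck, now=t0 + window, stays_for=window))
```

An "enters Crashed" trigger would fire in both cases. The "stays in" variant fires only for the second, at the cost of delaying real alerts by the length of the window.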
k
Negative on subflows, but that 2nd issue could be relevant. I like your quick fix on the Automation - may try that 🙂 These are completing successfully, under 1 min total runtime. GKE for k8s.
✅ 1
Can you recall how to customize the Automation to "set the automation to a 'stays in Crashed for x seconds' rather than 'enters Crashed state'"? Maybe we can't from the UI
e
It should be easy to do in the UI:
k
"Stays in" duh