# prefect-server
p
We have a flow task run that finished with the status of `RUNNING` instead of `SUCCESS` or `FAILURE`, which left all downstream tasks in `PENDING` status. Can you please advise why this happened and what we need to do on our side to address it?
Logs:
29 November 2021,12:17:54 	prefect.CloudFlowRunner	INFO	"Beginning Flow run for '30021ac5-7d89-43c2-ae17-81b439d50d68'"
29 November 2021,12:17:54 	prefect.CloudTaskRunner	INFO	"Task 'source': Starting task run..."
29 November 2021,12:17:54 	prefect.CloudTaskRunner	INFO	"Task 'source': Finished task run for task with final state: 'Running'"
29 November 2021,12:17:54 	prefect.CloudTaskRunner	INFO	"Task 'fork': Starting task run..."
29 November 2021,12:17:54 	prefect.CloudTaskRunner	INFO	"Task 'fork': Finished task run for task with final state: 'Pending'"
29 November 2021,12:17:54 	prefect.CloudTaskRunner	INFO	"Task 'preprocess': Starting task run..."
29 November 2021,12:17:54 	prefect.CloudTaskRunner	INFO	"Task 'preprocess': Finished task run for task with final state: 'Pending'"
29 November 2021,12:17:54 	prefect.CloudTaskRunner	INFO	"Task 'canary': Starting task run..."
29 November 2021,12:17:54 	prefect.CloudTaskRunner	INFO	"Task 'canary': Finished task run for task with final state: 'Pending'"
29 November 2021,12:17:55 	prefect.CloudTaskRunner	INFO	"Task 'concatenate': Starting task run..."
29 November 2021,12:17:55 	prefect.CloudTaskRunner	INFO	"Task 'concatenate': Finished task run for task with final state: 'Pending'"
29 November 2021,12:17:55 	prefect.CloudTaskRunner	INFO	"Task 'sink': Starting task run..."
29 November 2021,12:17:55 	prefect.CloudTaskRunner	INFO	"Task 'sink': Finished task run for task with final state: 'Pending'"
29 November 2021,12:17:55 	prefect.CloudFlowRunner	INFO	"Flow run RUNNING: terminal tasks are incomplete."
29 November 2021,12:28:27 	prefect-server.ZombieKiller.TaskRun	ERROR	No heartbeat detected from the remote task; marking the run as failed.
k
Is this task doing a long API call?
You can try changing the heartbeats to threads here and it might help
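For reference, a minimal sketch of how the heartbeat mode could be switched to threads. It assumes Prefect 0.15.x / 1.x, where the mode is read from the `PREFECT__CLOUD__HEARTBEAT_MODE` environment variable and takes `process`, `thread`, or `off`. The helper name `heartbeat_env` is hypothetical, and the Prefect calls are left as comments so the snippet runs without Prefect installed:

```python
# Heartbeat modes assumed here (Prefect 0.15.x / 1.x): "process", "thread", "off".
VALID_HEARTBEAT_MODES = {"process", "thread", "off"}


def heartbeat_env(mode: str) -> dict:
    """Build the environment override for a given heartbeat mode (hypothetical helper)."""
    if mode not in VALID_HEARTBEAT_MODES:
        raise ValueError(f"unknown heartbeat mode: {mode!r}")
    return {"PREFECT__CLOUD__HEARTBEAT_MODE": mode}


env = heartbeat_env("thread")

# With Prefect installed, the override would be attached to the flow's run config:
#   from prefect.run_configs import UniversalRun
#   flow.run_config = UniversalRun(env=env)
```

The flow has to be re-registered after changing the run config for the new environment to take effect.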
p
The source component is doing a Snowflake query. I don’t believe it should be a long API call. I’ll try a different heartbeat mechanism to see if it solves the problem.
@William Schor
@David Harrington
@Kevin Kho we still intermittently get heartbeat failures, even with threads. Do you have any other recommendations?
k
Normally, a missing heartbeat indicates that something happened to your flow, and Prefect is just telling you that it lost communication. The most common reason I have seen for heartbeat failures is resource starvation. In Prefect 0.15.5 and up, the error message should be raised. If you are confident that the Flow will succeed and you think Prefect is mistakenly marking it as failed (very rare), you can try turning heartbeats off for the Flow. Or you can put that task into its own subflow and turn off heartbeats on the subflow.
p
Thanks for the suggestions. We have tried increasing the resources on the job, but it still fails intermittently. Do you recommend turning off the heartbeat? What would we lose if we turn it off?
k
If you turn the heartbeat off and the compute dies, your Flow will be permanently seen as “Running” in the UI unless you mark it as Failed.
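To make the trade-off concrete, here is a toy model of a heartbeat-based zombie check. This is only an illustration and not Prefect's actual ZombieKiller implementation. The function name `zombie_check` and the 10-minute timeout are assumptions, with the timeout chosen to roughly match the gap in the logs above:

```python
from datetime import datetime, timedelta


def zombie_check(
    state: str,
    last_heartbeat: datetime,
    now: datetime,
    heartbeats_enabled: bool,
    timeout: timedelta = timedelta(minutes=10),  # assumed value, not taken from Prefect
) -> str:
    """Return the state a run would end up in after one zombie-check pass."""
    if state != "Running":
        return state  # only running work can be a zombie
    if not heartbeats_enabled:
        return state  # nothing to compare against, so the run is left alone
    if now - last_heartbeat > timeout:
        return "Failed"  # no recent heartbeat: assume the compute died
    return state


last_seen = datetime(2021, 11, 29, 12, 17, 55)
checked_at = datetime(2021, 11, 29, 12, 28, 27)

with_heartbeat = zombie_check("Running", last_seen, checked_at, heartbeats_enabled=True)
without_heartbeat = zombie_check("Running", last_seen, checked_at, heartbeats_enabled=False)
```

In this model, a dead run is only ever moved out of “Running” when heartbeats are enabled. With them off, someone has to set the final state by hand.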
p
Ok thanks for the info!