Robin

11/06/2020, 2:05 PM
Dear Prefect people, we get plenty of "No heartbeat detected from the remote task; marking the run as failed." errors. 💔 (e.g. https://cloud.prefect.io/accure/task-run/8e44bd11-e7c3-4982-bcf2-7711bc1ca4a9?logId=83475348-2531-4efb-9663-33748515908e) This is particularly unexpected, as most of those tasks seem to have almost finished already... 🤔 How do you approach these errors and debug them? 🙂

Marwan Sarieddine

11/06/2020, 2:30 PM
I suppose for starters you can disable heartbeat checks and see if your flow is still facing any issues - you can toggle off heartbeat from the UI or using graphql ...

Dylan

11/06/2020, 3:38 PM
Hi Robin! Marwan’s suggestion is a good one. Also, sometimes users experience this issue when their tasks are resource-starved and the thread that sends heartbeats has trouble getting resources. So you might try increasing the resources available to your tasks and see if the issue resolves.

Robin

11/06/2020, 3:39 PM
OK, merci!
I did not find the heartbeat settings in the UI
How could I do it with graphql?

Dylan

11/06/2020, 3:43 PM
You’ll find it on the Settings page for that Flow

Robin

11/06/2020, 3:44 PM
LOL, since when does that one exist? 😄 Never saw it 😮

Dylan

11/06/2020, 3:45 PM
I think we added it in February 😉

Robin

11/06/2020, 3:46 PM
OK, well, better late than never 🙂 But it might also be worth making it more visible?

Dylan

11/06/2020, 3:47 PM
I’ll definitely pass your feedback along 👍
We appreciate your input! 😄

Robin

11/06/2020, 3:53 PM
Haha, thanks for helping out! 🙂
Right now I have no flow running but task concurrency shows 10/10 🤔 Maybe the heartbeat deactivation did not do well in combination with task concurrency limits?

Dylan

11/06/2020, 7:08 PM
What is your In Progress Flow Runs tile showing on the Dashboard?

Robin

11/06/2020, 9:06 PM
Please find attached the screenshots, showing no flows running, two agents querying and 10/13 task concurrency 🤷‍♂️

Dylan

11/06/2020, 10:43 PM
Hey @Robin, looks like you might have some tasks left in a Running state in a cancelled flow run, which is causing the issue.
You can find the offending Task Runs with the GraphQL API and set them to Failed.

Robin

11/07/2020, 9:36 AM
OK, where can I find the related commands?
@Dylan, it seems like the tasks are still "running". What are the commands to find and stop the respective tasks?

Dylan

11/11/2020, 5:51 PM
Hi @Robin, if you visit the Interactive API page you’ll find the full API documentation.

Robin

11/11/2020, 6:01 PM
OK, thanks for pointing to the docs. However, I don't understand from it how to e.g. show all running tasks 🤔

Dylan

11/11/2020, 6:22 PM
You’ll want to construct a query that looks something like this:
query {
  task_run(where: {state: {_eq: "Running"}}) {
    id
    flow_run {
      id
      name
    }
  }
}
Then you can use a mutation (I believe it’s called set_task_run_state) to set the state.
Here’s some helpful docs on GraphQL and the underlying technology, Hasura, that powers our API:
• https://graphql.org/
• https://hasura.io/docs/1.0/graphql/core/index.html
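For readers following along, here is a minimal Python sketch of how the query above could be sent over HTTP and the result grouped by flow run, so that stuck task runs are easy to spot. The endpoint URL, the bearer-token header, and the helper names (`build_request_body`, `group_by_flow_run`, `post_graphql`) are assumptions made for illustration and are not taken from this thread. The state-setting mutation is deliberately left out, since its exact name and input shape are not confirmed here.

```python
import json
import urllib.request

# Assumption: illustrative endpoint; check your own Prefect API settings.
API_URL = "https://api.prefect.io/graphql"

# The query suggested in the thread: all task runs currently in a Running state.
RUNNING_TASK_RUNS_QUERY = """
query {
  task_run(where: {state: {_eq: "Running"}}) {
    id
    flow_run {
      id
      name
    }
  }
}
"""


def build_request_body(query, variables=None):
    """Encode a GraphQL query as the JSON bytes of a standard HTTP request body."""
    body = {"query": query}
    if variables:
        body["variables"] = variables
    return json.dumps(body).encode("utf-8")


def group_by_flow_run(response):
    """Map each flow run id to the ids of its task runs found in the response."""
    grouped = {}
    for task_run in response["data"]["task_run"]:
        grouped.setdefault(task_run["flow_run"]["id"], []).append(task_run["id"])
    return grouped


def post_graphql(query, token, url=API_URL):
    """Send the query to the API (hypothetical helper; auth scheme is assumed)."""
    request = urllib.request.Request(
        url,
        data=build_request_body(query),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a valid token, `group_by_flow_run(post_graphql(RUNNING_TASK_RUNS_QUERY, token))` would return a dictionary keyed by flow run id; the task run ids listed under a cancelled flow run are the candidates to set to Failed.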

Robin

11/11/2020, 6:27 PM
Thanks a lot for the additional information!! 🙂

Dylan

11/11/2020, 6:33 PM
Anytime!

Robin

11/11/2020, 6:48 PM
I just also asked for a kill switch feature in general and created this kill switch feature request, as suggested in this discussion.