#prefect-community

alex

07/20/2022, 2:46 PM
I have been running into an issue with Prefect Cloud 1.0 where my flow and task concurrency slots are being used up while the UI reports that no flows are running. There are also flow runs that were cancelled from the UI but have shown up as greyed-out boxes labelled "cancelling..." for many days. Has anyone run into this issue before?
I ran this query hoping to identify why my task concurrency limits were being used up while the Cloud UI reported that no flows were running:
query get_running_flows {
  flow_run(where: {state: {_eq: "Running"}}) {
    state
    created
    flow {
      id
    }
  }
}
I received this error
{
  "errors": [
    {
      "path": [
        "flow_run",
        0,
        "created"
      ],
      "message": "Cannot return null for non-nullable field flow_run.created.",
      "extensions": {
        "code": "INTERNAL_SERVER_ERROR"
      }
    }
  ],
  "data": null
}
So it looks like the associated flow is no longer available. I removed the `flow` clause from the query to just get the flow run IDs and tried deleting them using
mutation delete_flow_run {
  deleteFlowRun(input: {flowRunId: "87b8fa4f-a206-438f-95b2-364ea4187338"}) {
    success,
    error
  }
}
but I get this response
{
  "data": {
    "deleteFlowRun": {
      "success": false,
      "error": null
    }
  }
}
My flows run using either Kubernetes agents or local agents, and this issue affects both.
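The "Cannot return null for non-nullable field" response above records where the failure happened in its `path` entry: the root field, the index of the offending row, and the field that resolved to null. Because the null propagates upward through non-nullable parents, `data` comes back as `null` for the whole query. The following is a minimal sketch for reading that location out of such a response. The helper name `summarize_errors` is hypothetical and not part of any Prefect API; it works only on the JSON shape pasted in this thread.

```python
import json

# Response copied from the message above.
RESPONSE = """
{
  "errors": [
    {
      "path": ["flow_run", 0, "created"],
      "message": "Cannot return null for non-nullable field flow_run.created.",
      "extensions": {"code": "INTERNAL_SERVER_ERROR"}
    }
  ],
  "data": null
}
"""


def summarize_errors(response):
    """Return one (root_field, row_index, field_name, message) tuple per error."""
    summaries = []
    for error in response.get("errors") or []:
        path = error.get("path") or []
        root = path[0] if path else None
        # The first integer in the path is the index of the offending row, if any.
        index = next((p for p in path if isinstance(p, int)), None)
        # The last string in the path names the field that could not be resolved.
        field = next((p for p in reversed(path) if isinstance(p, str)), None)
        summaries.append((root, index, field, error.get("message", "")))
    return summaries


if __name__ == "__main__":
    for root, index, field, message in summarize_errors(json.loads(RESPONSE)):
        print(f"root={root} row={index} field={field}: {message}")
```

Knowing the row index makes it possible to narrow the query (for example with a `limit` or a tighter `where` filter) until the single broken record is isolated.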

Kevin Kho

07/20/2022, 9:26 PM
We can find the tasks specifically with a query. One sec, let me get it.
Can you try something like this:
query {
  task_run(where: {state: {_eq: "Running"}, task: {tags: {_contains: "tag"}}}) {
    id
    flow_run {
      name
      id
    }
    task {
      name
      tags
    }
  }
}
?

alex

07/20/2022, 9:53 PM
Hi @Kevin Kho, thank you for the snippet, it was very helpful. I ran the query and saw some very old task runs and flow runs that were still marked as running. I tried clearing one of the flow runs using
mutation delete_flow_run {
  deleteFlowRun(input: {flowRunId: "d781167d-aa8f-49e1-a5a9-ab548813191b"}) {
    success,
    error
  }
}
but now I'm getting this error for the original query you shared
{
  "errors": [
    {
      "path": [
        "task_run",
        0,
        "id"
      ],
      "message": "Cannot return null for non-nullable field task_run.id.",
      "extensions": {
        "code": "INTERNAL_SERVER_ERROR"
      }
    }
  ],
  "data": null
}
How would you recommend I clear these old flows?
Hi @Kevin Kho, thank you for your response. I just wanted to follow up and see if you could help me identify what is causing this issue and the best way to resolve it. There appears to be a data inconsistency on the Cloud server side, and it is causing issues with my production flows because the stale runs are consuming concurrency limits. Using this query
query check_flows_with_labels {
  flow_run(where: {state: {_eq: "Running"}, labels: {_contains: "limited-label"}}) {
    id
    created
    name
  }
}
I was able to identify some mysterious old flow runs that are not showing up in the UI.
Adding a `flow { name }` clause to the query above gives this error
{
  "errors": [
    {
      "path": [
        "flow_run",
        0,
        "id"
      ],
      "message": "Cannot return null for non-nullable field flow_run.id.",
      "extensions": {
        "code": "INTERNAL_SERVER_ERROR"
      }
    }
  ],
  "data": null
}
Trying to cancel one of the mysterious flow runs gives this error
mutation cancel_flow_run {
  cancel_flow_run(input: {flow_run_id: "cd1924f1-a0f3-4cc5-a35b-322d4e034e2b"}) {
    state
  }
}
{
  "errors": [
    {
      "path": [
        "cancel_flow_run"
      ],
      "message": "'NoneType' object has no attribute 'flow_group_id'",
      "extensions": {
        "code": "INTERNAL_SERVER_ERROR"
      }
    }
  ],
  "data": {
    "cancel_flow_run": null
  }
}
Hi @Kevin Kho, following up here. I understand you must be busy with the hard work your team has been doing on Prefect 2.0, but this is a big source of concern for my team and is impacting my flows in production. I have raised this before in a thread with @Anna Geller. tl;dr: my task concurrency limits are being used up by flow runs that apparently do not exist anymore. This is preventing newer flows from running.
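The `check_flows_with_labels` query above returns `id`, `created`, and `name` for every flow run still in a `Running` state, so one way to separate the stale runs from genuinely active ones is to compare `created` against a cutoff. The following is a minimal sketch of that filtering step only. The sample rows, the `stale_runs` helper, the one-day cutoff, and the ISO 8601 timestamp format are all assumptions for illustration, not output from Prefect Cloud.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical rows shaped like the `flow_run` results of the query above.
# The ids, names, and timestamps are made up for illustration.
SAMPLE_RUNS = [
    {"id": "run-old", "name": "example-old", "created": "2022-06-01T08:00:00+00:00"},
    {"id": "run-new", "name": "example-new", "created": "2022-07-20T13:30:00+00:00"},
    {"id": "run-null", "name": None, "created": None},
]


def stale_runs(flow_runs, now, max_age=timedelta(days=1)):
    """Return runs older than `max_age`, plus runs with no `created` value."""
    stale = []
    for run in flow_runs:
        created = run.get("created")
        if created is None:
            # A missing timestamp suggests a broken record; flag it for manual review.
            stale.append(run)
            continue
        # Assumes timezone-aware ISO 8601 strings such as 2022-06-01T08:00:00+00:00.
        if now - datetime.fromisoformat(created) > max_age:
            stale.append(run)
    return stale


if __name__ == "__main__":
    now = datetime(2022, 7, 20, 14, 0, tzinfo=timezone.utc)
    for run in stale_runs(SAMPLE_RUNS, now):
        print(run["id"], run["created"])
```

The resulting list of ids is a convenient thing to hand to support, since in this thread the delete and cancel mutations fail on exactly these records.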

Anna Geller

08/12/2022, 5:48 PM
@alex please don't tag anyone. Kevin is no longer at Prefect, and if you have urgent issues that affect your production workloads, you can reach out to the paid support channel at cs@prefect.io. Looking at this, I don't immediately see what the root cause is. Can you perhaps try removing any concurrency limits, re-registering your flows, and then re-adding the concurrency limits to check if that helps?