    George Coyne

    2 years ago
    A number of flows in Prefect Cloud are stuck in Submitted and the zombie killer is not putting them to bed. Any suggestions on getting them processed?

    nicholas

    2 years ago
    Hi @George Coyne - that doesn't sound out of the ordinary. The zombie killer is only responsible for running tasks that haven't issued a heartbeat in the last 2 minutes. For flows stuck in a Submitted state, the Lazarus process is responsible for rescheduling them every ~6 minutes or so, up to a maximum of 10 reschedules.
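The split of responsibilities described in this reply can be sketched roughly as follows. This is an illustrative model only, not Prefect Cloud's implementation: the `Run` class, the `next_action` function, and the returned strings are hypothetical names, and the thresholds are simply the figures quoted in the message (2 minutes, about 6 minutes, 10 reschedules).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Figures quoted in the thread; the real services may use different values.
HEARTBEAT_TIMEOUT = timedelta(minutes=2)
LAZARUS_INTERVAL = timedelta(minutes=6)
MAX_RESCHEDULES = 10


@dataclass
class Run:
    """Hypothetical, simplified view of a run's state."""
    state: str                                # e.g. "Running" or "Submitted"
    state_since: datetime                     # when the run entered this state
    last_heartbeat: Optional[datetime] = None
    reschedules: int = 0


def next_action(run: Run, now: datetime) -> str:
    """Return which process (if any) would act on the run in this model."""
    if run.state == "Running":
        # Zombie killer: only looks at running work with a stale heartbeat.
        last = run.last_heartbeat or run.state_since
        return "zombie_killer" if now - last > HEARTBEAT_TIMEOUT else "none"
    if run.state == "Submitted" and now - run.state_since >= LAZARUS_INTERVAL:
        # Lazarus: reschedules stuck Submitted runs, up to a fixed limit.
        if run.reschedules < MAX_RESCHEDULES:
            return "lazarus"
        return "lazarus_exhausted"  # hypothetical label for "limit reached"
    return "none"
```

In this model a run that sits in Submitted is never touched by the zombie killer, which matches the behaviour reported at the start of the thread.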

    George Coyne

    2 years ago
    Ah true
    OK manually going through and setting state to scheduled
    I have tasks queuing in submitted, I recently updated the agent, other than that nothing has changed in our cluster.

    Dylan

    2 years ago
    Hey @George Coyne what’s your new agent version?

    George Coyne

    2 years ago
    0.12.6

    Dylan

    2 years ago
    We’re checking one thing on our end, just a moment
    In the meantime, anything look funky in your agent logs?
    Flow Run logs also clear? You’re running on kubernetes right? Are jobs being created properly?
    Your work queue looks like it’s functioning properly to us

    George Coyne

    2 years ago
    Flow run logs are good, running on k8s, it just looks like some jobs aren’t actually queueing or something? I’m digging into it a little deeper but wanted to bring it to you guys sooner rather than later
    If I manually set these submitted flows back to Scheduled they get picked up

    Dylan

    2 years ago
    Hey @George Coyne just to double check, you’re saying that the first time the run is submitted the agent doesn’t create a k8s job but the second time the run goes into a submitted state the agent properly creates a job?

    George Coyne

    2 years ago
    Inconsistently but yes, some flows get pulled perfectly, some just don’t. I can’t determine any consistency to the behavior.

    Dylan

    2 years ago
    In the logs for the flow run, there should be a job ID for the created job. Take a look at the jobs created with those IDs. What state are the jobs in? Are they “Unschedulable”?

    George Coyne

    2 years ago
    That’s the thing, the jobs just don’t appear
    Checking agent logs

    Dylan

    2 years ago
    🧐

    George Coyne

    2 years ago
    prefect-job-8ffbfe27
    for instance, stuck in Submitted since 11:00 CDT
    Does not appear in agent logs, does not appear in kube jobs

    Dylan

    2 years ago
    Would you shoot me that flow run ID?

    George Coyne

    2 years ago
    3c45f67b-ffe9-4f9c-b0fc-da48b4a4f31d

    Dylan

    2 years ago
    Is that flow run present in your agent logs?
    Crazy question: do you have a second agent running?

    josh

    2 years ago
    Also one thing to confirm when checking for the job:
    kubectl get jobs --all-namespaces
    The fact that it’s stuck in submitted with a job ID attached means the agent is submitting it to k8s 🤔
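A small helper along these lines could automate the check suggested here. It is a sketch, assuming `kubectl` is on the PATH and configured for the cluster in question; `list_jobs_json` and `find_jobs` are hypothetical helper names. It relies on `kubectl get jobs --all-namespaces -o json`, which returns a list object whose `items` carry `metadata.name` and `metadata.namespace`.

```python
import json
import subprocess
from typing import List


def list_jobs_json() -> dict:
    """Run kubectl and return the parsed job list for all namespaces."""
    result = subprocess.run(
        ["kubectl", "get", "jobs", "--all-namespaces", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def find_jobs(job_list: dict, name: str) -> List[str]:
    """Return "namespace/name" for every job whose name matches exactly."""
    return [
        f'{item["metadata"]["namespace"]}/{item["metadata"]["name"]}'
        for item in job_list.get("items", [])
        if item["metadata"]["name"] == name
    ]
```

For example, `find_jobs(list_jobs_json(), "prefect-job-8ffbfe27")` would return an empty list if the job mentioned earlier was never created in any namespace of that cluster. With agents in more than one cluster, the check would need to be repeated against each cluster's context.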

    George Coyne

    2 years ago
    I do have a second agent running
    I spun up a 0.11.5 agent this morning

    Dylan

    2 years ago
    ah, so it’s not picking up runs accidentally?

    George Coyne

    2 years ago
    No jobs in either cluster as of right now
    and I checked both agent logs for the flow mentioned above
    Will the second agent cause an issue?

    Dylan

    2 years ago
    Both agents will pick up flows unless you add labels to both agents and flows
    if this is what you’re looking to do, then that’s totally fine
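To illustrate why two unlabeled agents can both pick up the same unlabeled flows, here is a minimal model of label matching. It assumes a simple rule, that an agent is eligible when the flow's labels are a subset of the agent's labels; `eligible_agents` and the agent names are hypothetical, and the real matching rules (particularly for unlabeled flows and labeled agents) may differ between Prefect versions.

```python
from typing import Dict, Iterable, List


def eligible_agents(
    flow_labels: Iterable[str], agents: Dict[str, Iterable[str]]
) -> List[str]:
    """Return the agents whose labels cover every label on the flow.

    Assumed rule for this sketch: flow labels must be a subset of agent labels.
    """
    needed = set(flow_labels)
    return [name for name, labels in agents.items() if needed <= set(labels)]


# Both agents unlabeled, flow unlabeled: either agent may take the run.
print(eligible_agents([], {"agent-a": [], "agent-b": []}))

# Labels on both the flow and the agents restrict the run to one agent.
print(eligible_agents(["cluster-1"], {"agent-a": ["cluster-1"], "agent-b": ["cluster-2"]}))
```

Under this model, giving each agent a distinct label and labeling the flows accordingly is what pins a flow run to a single cluster.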

    George Coyne

    2 years ago
    Yep I love that too, but in this situation both are running without a label

    Dylan

    2 years ago
    Gotcha, so you’re just trying to get an agent to submit a flow run?
    Are the agents in the same cluster?

    George Coyne

    2 years ago
    Agents are in separate clusters

    Dylan

    2 years ago
    Is the new agent performing better?

    George Coyne

    2 years ago
    I am trying to understand why flows are getting submitted without running, and to stop it

    Dylan

    2 years ago
    Understood
    Is the new agent accomplishing that goal?

    George Coyne

    2 years ago
    Nope
    I was blindly hoping that it would catch whatever was being skipped/missed
    I’ll try removing the 0.12.6 agent