<@ULVA73B9P> I added a flow run concurrency limit ...
# ask-marvin
w
@Marvin I added a flow run concurrency limit of 20 to my ECS work pool, and after that none of my flow runs would start; they were all stuck in the Scheduled state. When I removed the limit, all jobs started as usual.
m
It seems that there are currently no open issues related to ECS worker pool concurrency causing flow runs to be stuck in the scheduled state on the PrefectHQ GitHub repository. The behavior you're experiencing could be due to a configuration issue or an underlying problem with Prefect's handling of concurrency limits in conjunction with your ECS worker pool. Since setting a concurrency limit of 20 led to flow runs not starting, it's possible that the concurrency control mechanism might not have been operating correctly, potentially due to how the ECS workers and Prefect server interact to manage flow run execution.
To troubleshoot this further, I would recommend checking the following:
1. Review the Prefect server logs to see if they provide any clues about why flow runs remained scheduled.
2. Ensure that the concurrency limit set was properly propagated to the ECS workers and that Prefect was aware of this configuration.
3. Validate that your ECS workers are properly registered and that there are no connectivity issues between them and the Prefect server.
4. Review your prefetch configuration, if any, to ensure it aligns with the ECS settings.
5. Check other Prefect settings, such as heartbeats, to confirm there are no other limitations affecting the flow runs.
If after reviewing these points the issue persists and it is not something easily identifiable, it may be worthwhile to reach out to the Prefect community or to open a new issue on the Prefect GitHub repository with detailed information about your setup and the issue you encountered.
w
@Nate do you know why setting flow run concurrency limit on my work pool would stop any flows from running?
n
what’s the limit set at? do you have an old run that blipped off the map while still in running or something?
w
I have set it at 10
When the limit is set, no flow runs start
When I remove the limit, flow runs start as usual
n
i guess i’m wondering if there could have been a situation where prefect lost track of some number of runs while they were in RUNNING, so as far as the API knows, they still are. you could use client.read_flow_runs with a flow run filter to test this hypothesis
w
Hi Nate, I've just used that code and there are no running flows
from prefect import get_client
import asyncio

from prefect.client.schemas import StateType
from prefect.client.schemas.filters import (
    FlowRunFilter,
    FlowRunFilterState,
    FlowRunFilterStateType,
)


async def main():
    async with get_client() as client:
        runs = await client.read_flow_runs(
            flow_run_filter=FlowRunFilter(
                state=FlowRunFilterState(
                    type=FlowRunFilterStateType(any_=[StateType.RUNNING])
                )
            )
        )
        print(len(runs))
        print(runs)


asyncio.run(main())
n
hmmm i can’t say i’m sure then 🤔
w
I also tried adding a limit to the work queue associated with the work pool in question. This had the same result
👍 1
n
weird - i’ll look at this more tomorrow
w
Okay thanks
n
also question, is this a push pool or not?
w
I'm not too sure sorry. It is a work pool that has an ecs worker which pulls from the queue and kicks off ECS fargate tasks for each flow run.
So the ECS worker is always running, but the Fargate tasks it kicks off for each flow run are not always running, they are started when a flow run is started
👍 1
n
oh ok, a push pool just means we run the worker for you 👍 so it sounds like no in your case. helpful info, thanks!
w
Also not sure if it makes a difference but I am self hosting prefect server
@Nate found the issue. According to the docs, concurrency limits count runs in the PENDING or RUNNING state (https://docs.prefect.io/latest/concepts/work-pools/#managing-concurrency). There were a bunch of runs stuck in PENDING from months back, and they used up the limit, which blocked any new runs from starting. Once I deleted these old pending runs, everything worked as expected
n
ahh great to hear you got it sorted!
m
This one caught me once again. I think there is an open issue on showing stale flow runs.
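For anyone landing here with the same symptom, here is a minimal sketch of the check that resolved this thread: list runs that are still in PENDING or RUNNING and flag the ones that have been there suspiciously long. It reuses the `read_flow_runs` filter from the snippet earlier in the thread. The helper name `find_stale_runs`, the `report_stale_runs` wrapper, the dict layout, and the one-day threshold are all illustrative assumptions, not part of Prefect, and it assumes a Prefect 2.x client where `run.state.type` and `run.state.timestamp` are available.

```python
from datetime import datetime, timedelta, timezone

# States that, per the docs linked in the thread, count against the limit.
SLOT_HOLDING_STATES = ("PENDING", "RUNNING")


def find_stale_runs(runs, now, max_age=timedelta(days=1)):
    """Hypothetical helper: return slot-holding runs older than max_age.

    `runs` is a list of dicts with "state" (str) and "since" (aware datetime).
    The result is sorted oldest first.
    """
    stale = [
        r for r in runs
        if r["state"] in SLOT_HOLDING_STATES and now - r["since"] > max_age
    ]
    return sorted(stale, key=lambda r: r["since"])


async def report_stale_runs(max_age=timedelta(days=1)):
    # Imported lazily so find_stale_runs can be used without Prefect installed.
    from prefect import get_client
    from prefect.client.schemas import StateType
    from prefect.client.schemas.filters import (
        FlowRunFilter,
        FlowRunFilterState,
        FlowRunFilterStateType,
    )

    async with get_client() as client:
        runs = await client.read_flow_runs(
            flow_run_filter=FlowRunFilter(
                state=FlowRunFilterState(
                    type=FlowRunFilterStateType(
                        any_=[StateType.PENDING, StateType.RUNNING]
                    )
                )
            )
        )

    # Flatten to plain dicts (assumed layout) before applying the age check.
    summaries = [
        {
            "id": run.id,
            "name": run.name,
            "state": run.state.type.value,
            "since": run.state.timestamp,
        }
        for run in runs
        if run.state is not None
    ]
    stale = find_stale_runs(summaries, datetime.now(timezone.utc), max_age)
    for r in stale:
        print(f'{r["since"]}  {r["state"]:8}  {r["name"]}  {r["id"]}')
    return stale
```

To try it, run `asyncio.run(report_stale_runs())` after `import asyncio`. Note that the check earlier in the thread filtered on RUNNING only, which is why it came back empty; including PENDING is what surfaces the old runs. The query may return only one page of results if there are many runs. Review the list before removing anything; the client's `delete_flow_run` can then be called with each run's id.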