# ask-community
l
Hello everyone! How are concurrency limit collision strategies intended to work? I set up a deployment to run every 30 seconds with a concurrency limit of 1 and a collision strategy of CANCEL_NEW, but I'm having it call a task that sleeps for 5 minutes as a test. I expected that when the scheduler tried to trigger a new run while the old one was still going, it either wouldn't start it or would cancel it immediately. Instead, I ended up with an ever-increasing queue of waiting flow runs. Is this the expected behavior? If so, what is the intended usage of CANCEL_NEW, and is there a way to avoid creating new scheduled runs if the previous run doesn't finish as soon as expected?
b
Hey Lee! So the CANCEL_NEW collision strategy should cancel new runs that exceed the concurrency limit provided for a deployment (rather than queuing them). Once the limit is hit, the new runs should be cancelled immediately.
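For reference, this is roughly how that kind of config looks when deploying from Python instead of prefect.yaml (a sketch, not tested, with made-up flow/pool names; the exact import path and accepted parameters can vary a bit between Prefect versions):

from prefect import flow
from prefect.client.schemas.objects import ConcurrencyLimitConfig, ConcurrencyLimitStrategy


@flow
def my_flow():
    ...


if __name__ == "__main__":
    # Hypothetical names; assumes the flow source is available to the work pool
    # (e.g. loaded via flow.from_source or baked into an image).
    my_flow.deploy(
        name="my-deployment",
        work_pool_name="my-pool",
        interval=30,  # schedule a run every 30 seconds
        concurrency_limit=ConcurrencyLimitConfig(
            limit=1,
            collision_strategy=ConcurrencyLimitStrategy.CANCEL_NEW,  # cancel runs over the limit
        ),
    )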
What are the states of the flow runs that are amassing in the queue? 👀
l
They are all Late.
b
Hmm.. when you go to look at the collision strategy on the deployment, does it show CANCEL_NEW?
^this page shows up when you go to edit the deployment
l
Looks right to me.
b
Weird. Mind dropping the code you're using to test this, too?
l
I can't drop the exact code but I'll try to get a minimal reproduction.
b
That'd be perfect, thank you!
l
This seems sufficient to reproduce it. flows.py:
from prefect import flow
from time import sleep


@flow
def slow_flow():
    # sleep long enough that the next 30-second scheduled run overlaps this one
    sleep(90)
prefect.yaml:
# Welcome to your prefect.yaml file! You can use this file for storing and managing
# configuration for deploying your flows. We recommend committing this file to source
# control along with your flow code.

# Generic metadata about this project
name: slow_flow_test

# build section allows you to manage and build docker images
build: null

# push section allows you to manage if and how this project is uploaded to remote locations
push: null

# pull section allows you to provide instructions for cloning this project in remote locations
pull: null

# the deployments section allows you to provide configuration for deploying flows
deployments:
  - name: "slow_flow"
    entrypoint: flows.py:slow_flow
    work_pool:
      name: slow_flow_pool
    schedule:
      interval: 30
    concurrency_limit:
      limit: 1
      collision_strategy: CANCEL_NEW
Then run something like this:
prefect server start &
sleep 10

export PREFECT_API_URL=http://localhost:4200/api

prefect work-pool create slow_flow_pool --type process
prefect work-pool set-concurrency-limit slow_flow_pool 1
prefect deploy --all
prefect worker start --pool slow_flow_pool --type process --limit 1 &
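To watch what happens I've just been listing runs from the CLI; there may be a nicer way to filter, but something like this shows more and more runs stuck in Late after a couple of minutes:

prefect flow-run ls --limit 20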
After looking at it I'm guessing the issue is that there is a concurrency limit on both the work pool and the deployment.
Well, maybe not. Manually removing the concurrency limit on the pool didn't change anything.
It looks like removing the concurrency limits on both the work pool and the worker causes the extra runs to be canceled, as expected. I can work around that for now but it is a bit confusing.
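Concretely, the workaround that made the extra runs get cancelled was (assuming I've got the CLI right):

# drop the work pool concurrency limit
prefect work-pool clear-concurrency-limit slow_flow_pool
# restart the worker without the --limit flag
prefect worker start --pool slow_flow_pool --type process &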
b
Interesting, good thinking testing that out.
I'll give this a whirl to try and replicate, and share with the team. Maybe I'm missing something here too, but I feel like the runs should still be clearing out regardless of there being a work pool/worker concurrency limit in place. 🤔
l
Sounds great, thanks!
b
Hey Lee! I was able to reproduce what you saw yesterday. Effectively, if there was a limit on either the work pool or worker, the late runs would begin to show up. I created an issue here: https://github.com/PrefectHQ/prefect/issues/16984
Let me know if there are any other details you'd like to add. You're welcome to comment there as well.