How do I limit concurrent execution of a deployment with a p Prefect Community #ask-marvin

# ask-marvin

Igor Kaluder

04/25/2024, 8:55 PM

How do I limit concurrent execution of a deployment with a particular parameter value?

Nate

04/25/2024, 8:58 PM

@Marvin How do I limit concurrent execution of a deployment with a particular parameter value?

Marvin

04/25/2024, 8:59 PM

To limit concurrent execution of a deployment based on a specific parameter value, you can use Prefect's global concurrency limits. This feature lets you define concurrency limits that are shared across flows and deployments so that certain resources, such as database connections, are not overused. Concurrency limits are applied with the `concurrency` context manager in your flow or task code, which requests slots on a named limit and blocks while none are free. Here's an example snippet from the Prefect documentation:

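```python
import psycopg2
from prefect import flow, task
from prefect.concurrency.sync import concurrency


@task
def database_query(query):
    # Take one slot on the "database" limit; blocks while all slots are in use.
    with concurrency("database", occupy=1):
        connection = psycopg2.connect("<connection_string>")
        cursor = connection.cursor()
        cursor.execute(query)
        result = cursor.fetchall()
        connection.close()
    return result


@flow
def my_flow():
    queries = ["SELECT * FROM table1", "SELECT * FROM table2", "SELECT * FROM table3"]
    # Tasks (not flows) expose .submit(), so each query runs as a submitted task.
    for query in queries:
        database_query.submit(query)


if __name__ == "__main__":
    my_flow()
```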

In this example, the `concurrency` context manager is used with the limit named "database". The `occupy` parameter sets how many slots of that limit each block takes; the total number of slots is configured on the limit itself. You can create a separate concurrency limit for each parameter value and reference it inside your flows. You can read more about this in the Prefect documentation under Global Concurrency Limits.

Nate

04/25/2024, 8:59 PM

and fwiw i dont believe we have first class concurrency limits that can be based on param values. honestly global concurrency (like above^) and `run_deployment` might make sense depending on your use case

Igor Kaluder

04/25/2024, 9:03 PM

Thanks Nate, we have an ETL pipeline ingesting from 10 tables. We loop over a list of tables and call run_deployment(parameters={"table": table}) for each one. I was wondering if there's a way to set the concurrency limit so that only one flow run for a particular table can be running at a time.
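
For reference, here is a minimal sketch of how the global concurrency suggestion could be applied per table. It assumes one global concurrency limit per table has already been created with a limit of 1. The `etl-table-` naming scheme and the `limit_name_for` and `ingest_table` names are hypothetical, not part of Prefect; only the `concurrency` context manager comes from the library.

```python
import re


def limit_name_for(table: str) -> str:
    """Build a per-table limit name (hypothetical naming scheme)."""
    slug = re.sub(r"[^a-z0-9]+", "-", table.lower()).strip("-")
    return f"etl-table-{slug}"


def ingest_table(table: str) -> None:
    """Body of the ingestion flow; decorate with @flow in a real deployment."""
    # Imported lazily so the naming helper can be used without Prefect installed.
    from prefect.concurrency.sync import concurrency

    # Assumption: a limit with this name and a single slot already exists.
    # A second run for the same table waits here until the first one exits.
    with concurrency(limit_name_for(table), occupy=1):
        ...  # extract and load the table
```

Runs for different tables use different limit names, so they should not block each other. A run for a table that is already being ingested waits for the slot rather than being skipped.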

Igor Kaluder

04/25/2024, 9:10 PM

Maybe we could set the idempotency_key to equal the table name?
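
A rough sketch of the idempotency key idea, with one caveat stated as an assumption: as far as I understand, Prefect returns the existing flow run whenever a key has been seen before, whether or not that run is still in progress. A key made of the table name alone would then block every later run for that table, so this sketch adds the run date to the key. The `idempotency_key_for` and `trigger_ingest` names and the key format are hypothetical.

```python
from datetime import date


def idempotency_key_for(table: str, run_date: date) -> str:
    """Combine table and date so that each table gets one run per day (hypothetical format)."""
    return f"{table}:{run_date.isoformat()}"


def trigger_ingest(deployment_name: str, table: str, run_date: date):
    """Request a flow run for one table; repeated calls with the same key reuse the run."""
    # Imported lazily so the key helper can be used without Prefect installed.
    from prefect.deployments import run_deployment

    return run_deployment(
        name=deployment_name,
        parameters={"table": table},
        idempotency_key=idempotency_key_for(table, run_date),
        timeout=0,  # return right away instead of waiting for the run to finish
    )
```

This deduplicates run creation. It does not make a second run wait for the first, so it fits a "skip if already requested" policy better than a "one at a time" policy.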
