Hi guys! Is there any way to set task rate limits ...
# ask-community
i
Hi guys! Is there any way to set task rate limits in v3? It seems quite silly that this isn't covered in the docs. Celery, for example, lets you set task ingestion limits (i.e. 2/s, 10/m and so on) directly through the task decorator. This way, when a "flow" runs, it makes sure that tasks are not consumed too quickly. As far as I can tell, the only way to configure this in Prefect is with YAML, and even then it's a global rate limit. I need to set these limits at runtime, so they cannot be predefined in a YAML file. Take this example:
from prefect import flow, task

@task
def add(x: int, y: int) -> int:
    return x + y

@task
def multiply(x: int, y: int) -> int:
    return x * y

@flow
def add_and_multiply(x: int, y: int):
    total = add(x, y)  # named `total` to avoid shadowing the built-in sum()
    product = multiply(x, y)
    return total, product
If the rate limit for `add` is 5/s and the rate limit for `multiply` is 200/s, how should I modify the code to accomplish this without having to define it in every flow that uses these functions?
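For illustration, the decorator-level behaviour being asked about (a per-function limit declared once, next to the function, so every caller inherits it) can be sketched in plain Python. This is not Prefect's API: `rate_limited` and its `calls` / `per_seconds` parameters are hypothetical names, and the limiter only works inside a single process. Whether it composes cleanly underneath `@task` is an assumption that has not been verified here.

```python
import functools
import threading
import time


def rate_limited(calls: int, per_seconds: float):
    """Hypothetical helper: allow `calls` invocations per `per_seconds`, evenly spaced."""
    interval = per_seconds / calls
    lock = threading.Lock()
    next_slot = 0.0  # monotonic time at which the next call may start

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            nonlocal next_slot
            with lock:
                now = time.monotonic()
                start = max(now, next_slot)
                next_slot = start + interval  # reserve this call's slot
            time.sleep(max(0.0, start - now))  # wait outside the lock
            return fn(*args, **kwargs)

        return wrapper

    return decorator


@rate_limited(calls=5, per_seconds=1)  # 5/s
def add(x: int, y: int) -> int:
    return x + y


@rate_limited(calls=200, per_seconds=1)  # 200/s
def multiply(x: int, y: int) -> int:
    return x * y
```

Because the limit lives on the function itself, any code that calls `add` or `multiply` is throttled without repeating the configuration. A limit shared across several workers or machines would need a central coordinator, which this sketch does not attempt.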
a
Curious to learn more - what’s your use case?
i
One of my use cases looks something like this:
import asyncio
from typing import Dict

from prefect import flow, task

@task  # Rate limit 60/min
async def scrape_website(url: str) -> Dict:
    return {"html": "data"}

@task  # Rate limit 20/min
async def analyze_website(html: str) -> Dict:
    return {"analysis": "data"}

@flow
async def fetch_data_from_website(url: str):
    scraped_data = scrape_website.submit(url).result()
    analyze_website.submit(scraped_data['html'])


async def main(urls):
    # Let's say that this pipe gets called 200/min
    # Build the coroutines without awaiting them so gather() can run them concurrently
    tasks = [fetch_data_from_website(url) for url in urls]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    urls = ["https://google.com", "https://facebook.com"] * 100
    asyncio.run(main(urls))
I want to run these all concurrently, but I want it to wait for open slots if the rate limit has been exceeded (i.e. it waits 1s per `scrape_website` request and about 3s per `analyze_website` request). As far as I can tell, I cannot define this in Python code and instead have to rely on YAML, which means I need to set global rates, and that doesn't really make sense to me.
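To make the "wait for an open slot" behaviour concrete, here is a minimal sketch in plain asyncio rather than Prefect. `AsyncRateLimiter` and `wait_for_slot` are hypothetical names introduced for illustration, the Prefect decorators are left out so the snippet runs on its own, and the limiter only coordinates coroutines on a single event loop. It also shows the arithmetic: 60/min works out to one slot per second, and 20/min to one slot every 3 seconds.

```python
import asyncio
import time


class AsyncRateLimiter:
    """Hypothetical helper: hands out evenly spaced start slots."""

    def __init__(self, calls: int, per_seconds: float) -> None:
        self.interval = per_seconds / calls  # seconds between slots
        self._next_slot = 0.0

    async def wait_for_slot(self) -> None:
        # There is no await between reading and updating _next_slot, so
        # coroutines on one event loop cannot reserve the same slot.
        now = time.monotonic()
        start = max(now, self._next_slot)
        self._next_slot = start + self.interval
        await asyncio.sleep(start - now)


scrape_limit = AsyncRateLimiter(calls=60, per_seconds=60)   # 60/min: 1 s apart
analyze_limit = AsyncRateLimiter(calls=20, per_seconds=60)  # 20/min: 3 s apart


async def scrape_website(url: str) -> dict:
    await scrape_limit.wait_for_slot()
    return {"html": "data"}


async def analyze_website(html: str) -> dict:
    await analyze_limit.wait_for_slot()
    return {"analysis": "data"}


async def fetch_data_from_website(url: str) -> dict:
    scraped = await scrape_website(url)
    return await analyze_website(scraped["html"])


async def main(urls: list[str]) -> list[dict]:
    # All pipelines start concurrently; each one pauses at its limiter.
    return await asyncio.gather(*(fetch_data_from_website(u) for u in urls))
```

Calling `asyncio.run(main(urls))` starts every pipeline at once, and each call then sleeps until its reserved slot arrives. The limits are ordinary Python objects created at runtime, which is the property being asked for, but they are not shared between processes.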
n
Hi @Imran Nooraddin, have you read the docs on this?