Thread
#prefect-community
    Luke Orland

    2 years ago
    We are getting some workflow runs failing immediately on Prefect Cloud with just one log message:
    March 17th 2020 at 2:32:51am | agent
    ERROR 
    An error occurred (ThrottlingException) when calling the RunTask operation (reached max retries: 4): Rate exceeded.
    Zachary Hughes

    2 years ago
    Hi Luke, sorry you're running into this. Taking a look now.
    Luke Orland

    2 years ago
    OK, thanks!
    Zachary Hughes

    2 years ago
    Quick follow-up question: is your flow doing anything with AWS?
    Luke Orland

    2 years ago
    other runs of the same flow (with different parameter values) are failing with a different single log message:
    March 17th 2020 at 2:32:59am | agent
    ERROR 
    list index out of range
    And yes, the flow is accessing S3.
    Zachary Hughes

    2 years ago
    And what does your flow execution setup look like? This looks like something on the AWS end, but trying to narrow down where it could be happening. Happy to take this to DM if you'd rather not share specifics of how you run your work.
    Luke Orland

    2 years ago
    so my colleague kicked off 365 parameterized runs of the same workflow in a for loop 🙂
    Zachary Hughes

    2 years ago
    Welp, that could definitely trigger some AWS throttling.
    Happy to help with any other questions you have, but that sounds like the culprit.
    Luke Orland

    2 years ago
    the first task isn't accessing S3 though...
    Is the agent being throttled from too many ECR requests?
    Zachary Hughes

    2 years ago
    Without knowing more about how you're set up, my guess would be that something in the ECR/ECS or EKS API is what's throwing that.
    Luke Orland

    2 years ago
    yeah it's a FargateAgent
    Zachary Hughes

    2 years ago
    Because it sounds like you're seeing this error message before the flow even has a chance to get submitted.
    Luke Orland

    2 years ago
    yeah
    OK, I think we'll have to do some negative engineering 😉
    maybe sleep between client.create_flow_run calls to space them out.
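A minimal sketch of the "sleep between calls" idea above. It is not taken from the thread: `submit_spaced`, its parameters, and the default delay are hypothetical names and values chosen for illustration, and the right delay depends on the AWS limits that apply to your account.

```python
import time
from typing import Any, Callable, Dict, Iterable, List


def submit_spaced(
    submit: Callable[[Dict[str, Any]], Any],
    parameter_sets: Iterable[Dict[str, Any]],
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """Call submit(params) once per parameter set, pausing between calls."""
    results = []
    for index, params in enumerate(parameter_sets):
        if index > 0:
            sleep(delay_seconds)  # pause before every call except the first
        results.append(submit(params))
    return results


# Hypothetical usage with a Prefect client (flow_id is a placeholder you supply):
# submit_spaced(
#     lambda params: client.create_flow_run(flow_id=flow_id, parameters=params),
#     parameter_sets,
#     delay_seconds=5.0,
# )
```

Passing `sleep` as an argument keeps the helper easy to test without actually waiting.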
    Zachary Hughes

    2 years ago
    That could do the trick in the short term. I also know that when I was in AWS consulting, we could contact support to get certain limits raised. YMMV there.
    Luke Orland

    2 years ago
    Would it make sense for me to open an issue / feature request to implement retries in the FargateAgent to handle throttling errors from AWS?
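For reference, a rough sketch of what retrying on throttling errors could look like. This is not how the FargateAgent is implemented; `call_with_backoff` and `is_throttling_error` are hypothetical helpers. The sketch assumes the exception exposes a botocore-style `response["Error"]["Code"]` field and only treats the `ThrottlingException` code from the log above as retryable.

```python
import time
from typing import Any, Callable

# Only the code seen in the log message; other AWS throttling codes may exist.
THROTTLING_CODES = {"ThrottlingException"}


def is_throttling_error(exc: Exception) -> bool:
    """Return True if exc carries a botocore-style throttling error code."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    return response.get("Error", {}).get("Code") in THROTTLING_CODES


def call_with_backoff(
    fn: Callable[[], Any],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fn(), retrying throttling errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            # Re-raise anything that is not throttling, or the final failure.
            if last_attempt or not is_throttling_error(exc):
                raise
            # Wait 1x, 2x, 4x, ... base_delay, capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

With a boto3 ECS client, this could wrap the call as `call_with_backoff(lambda: ecs_client.run_task(**task_kwargs))`, where `ecs_client` and `task_kwargs` are placeholders. Adding random jitter to the delay would help when many runs retry at the same moment.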
    Zachary Hughes

    2 years ago
    If nothing else, it's definitely worth discussion! Go for it.