
Luke Orland

03/18/2020, 7:57 PM
We are getting some workflow runs failing immediately on Prefect Cloud with just one log message:
March 17th 2020 at 2:32:51am | agent
ERROR 
An error occurred (ThrottlingException) when calling the RunTask operation (reached max retries: 4): Rate exceeded.

Zachary Hughes

03/18/2020, 7:58 PM
Hi Luke, sorry you're running into this. Taking a look now.

Luke Orland

03/18/2020, 7:58 PM
ok, Thanks!

Zachary Hughes

03/18/2020, 8:00 PM
Quick follow-up question: is your flow doing anything with AWS?

Luke Orland

03/18/2020, 8:02 PM
other runs of the same flow (with different parameter values) are failing with a different single log message:
March 17th 2020 at 2:32:59am | agent
ERROR 
list index out of range
and yes, the flow is accessing S3

Zachary Hughes

03/18/2020, 8:06 PM
And what does your flow execution setup look like? This looks like something on the AWS end, but I'm trying to narrow down where it could be happening. Happy to take this to DM if you'd rather not share specifics of how you run your work.

Luke Orland

03/18/2020, 8:08 PM
so my colleague kicked off 365 parameterized runs of the same workflow in a for loop 🙂
😂 2

Zachary Hughes

03/18/2020, 8:09 PM
Welp, that could definitely trigger some AWS throttling.
Happy to help with any other questions you have, but that sounds like the culprit.

Luke Orland

03/18/2020, 8:11 PM
the first task isn't accessing S3 though...
Is the agent being throttled from too many ECR requests?

Zachary Hughes

03/18/2020, 8:12 PM
Without knowing more about how you're set up, my guess would be that something in the ECR/ECS or EKS API is what's throwing that.

Luke Orland

03/18/2020, 8:12 PM
yeah it's a FargateAgent

Zachary Hughes

03/18/2020, 8:12 PM
Because it sounds like you're seeing this error message before the flow even has a chance to get submitted.

Luke Orland

03/18/2020, 8:12 PM
yeah
Ok, i think we'll have to do some negative engineering 😉
maybe sleep between client.create_flow_run calls to space them out.
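
A minimal sketch of the "sleep between calls" idea, for illustration only. The helper name `submit_spaced`, the `delay_seconds` default, and the injectable `sleep` argument are all hypothetical; `submit` stands in for whatever actually creates the run (for example a small wrapper around `client.create_flow_run`), so the snippet runs without Prefect installed.

```python
import time
from typing import Any, Callable, Dict, Iterable, List


def submit_spaced(
    submit: Callable[..., Any],
    parameter_sets: Iterable[Dict[str, Any]],
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """Call submit(parameters=...) once per parameter set, pausing between calls.

    `submit` is a hypothetical stand-in for the real run-creation call.
    `sleep` is injectable so the pacing can be checked without waiting.
    """
    results = []
    for index, params in enumerate(parameter_sets):
        if index > 0:
            # Pause before every call except the first to spread out the
            # downstream AWS requests the agent makes for each run.
            sleep(delay_seconds)
        results.append(submit(parameters=params))
    return results


# Hypothetical usage (assumes a Prefect Client and a known flow ID):
#   client = prefect.Client()
#   submit_spaced(
#       lambda parameters: client.create_flow_run(flow_id=FLOW_ID, parameters=parameters),
#       [{"day": d} for d in range(1, 366)],
#       delay_seconds=5,
#   )
```

A fixed delay is the simplest option. The right spacing depends on the account's AWS rate limits, which this thread does not state, so the value would need tuning.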

Zachary Hughes

03/18/2020, 8:18 PM
That could do the trick in the short term. I also know that when I was in AWS consulting, we could contact support to get certain limits raised. YMMV there.

Luke Orland

03/18/2020, 8:21 PM
Would it make sense for me to open an issue / feature request to implement retries in the FargateAgent to handle throttling errors from AWS?
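
To make the feature request concrete, here is a rough sketch of what retrying on throttling errors could look like. It is not how the FargateAgent is implemented. The names `call_with_backoff`, `is_throttling_error`, and `THROTTLE_CODES` are hypothetical, and the list of error codes is an assumption beyond the `ThrottlingException` seen in the log. The check relies on the `response["Error"]["Code"]` shape that botocore's `ClientError` exposes. Note that the log above shows boto3 had already retried 4 times internally, so a wrapper like this would sit on top of those retries.

```python
import random
import time
from typing import Any, Callable

# Assumed set of throttling-style error codes; only "ThrottlingException"
# appears in the log above.
THROTTLE_CODES = {"ThrottlingException", "Throttling", "TooManyRequestsException"}


def is_throttling_error(exc: BaseException) -> bool:
    """Return True if exc carries a botocore-style throttling error code."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    return response.get("Error", {}).get("Code") in THROTTLE_CODES


def call_with_backoff(
    fn: Callable[[], Any],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fn(), retrying throttling errors with jittered exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Re-raise anything that is not throttling, or the final failure.
            if not is_throttling_error(exc) or attempt == max_attempts - 1:
                raise
            # Full jitter: wait a random time up to the capped exponential bound.
            bound = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, bound))


# Hypothetical usage with a boto3 ECS client:
#   call_with_backoff(lambda: ecs_client.run_task(**task_kwargs))
```

The random jitter matters here: with hundreds of runs submitted at once, retries that all wake at the same moment would likely hit the rate limit again together.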

Zachary Hughes

03/18/2020, 8:23 PM
If nothing else, it's definitely worth discussion! Go for it.
👍 1

Luke Orland

03/18/2020, 8:39 PM