Prefect Community #ask-community


Jake

02/15/2022, 6:59 PM

Hi everyone; I’m running into the following error:


ConnectionClosedError('Connection was closed before we received a valid response from endpoint URL: "<insert_s3_url_here>.prefect_result".',)

It is causing our flows to fail intermittently and I’m not sure how to debug this.

Kevin Kho

02/15/2022, 7:06 PM

I think that is specifically a boto3 error. Are you writing a big file, or is your task returning something big? If so, I think we can try increasing the timeout.

Jake

02/15/2022, 7:11 PM

I don’t think it’s a big file; it seems to be happening when trying to save a prefect result

Kevin Kho

02/15/2022, 7:11 PM

Do you need the result? Cuz you can always turn checkpointing off if you don’t need it

Jake

02/15/2022, 7:12 PM

What exactly is the result? I believe all our writes are to a db so I don’t think the result is needed.

Kevin Kho

02/15/2022, 7:14 PM

The result is used in the event you need to restart the Flow. Prefect can restart from failure because it can load the upstream result. The same goes if you cache something across Flow runs: Prefect will be able to pull it from the Result. If you don’t need it, you can turn it off at the task level:

@task(checkpoint=False)


Kevin Kho

02/15/2022, 7:15 PM

So the result is the task output (the return) and this is persisted in the event you need it later

Jake

02/17/2022, 7:51 PM

I’ve looked into the sizes; they are typically really small, 15KB max. This issue happens intermittently, but the frequency with which it occurs is increasing.

Kevin Kho

02/17/2022, 7:53 PM

Do you have a lot of concurrent AWS connections? I suspect this could also be rate limiting?

Jake

02/17/2022, 7:55 PM

potentially.. thanks for the suggestion; I will check!

Jake

02/17/2022, 8:17 PM

I don’t think this is due to rate limiting: according to this documentation, we should get “3,500 PUT/COPY/POST/DELETE” requests per second. To my understanding, we only need to make one request per task, we have way fewer tasks than that, and they take way longer than 1 second to complete.

Kevin Kho

02/17/2022, 8:25 PM

So it’s hard to tell what could cause it, but it seems to be infrastructure. This guy says it was related to firewall blocking (which is what I imagine). The answer below that said setting the region helped (I wouldn’t know why). These also seem to indicate a firewall. And this GitHub issue said it was due to uploading a zip, but it is unresolved. I guess on your end, you could also try seeing if it’s specific to the upload, or you can try increasing the timeout (but it doesn’t seem to be a timeout either). This is specifically an AWS error that happens on the connection between your machine and S3, so I really don’t know how else we can help here, unfortunately.


Jake

03/25/2022, 8:03 PM

Update: this issue is still happening intermittently. We have noticed this specific boto issue in our own codebase (unrelated to Prefect). We built some simple retry logic for when this happens, and it would be a cool feature if Prefect could offer something similar (an automatic retry for this specific Amazon problem)?
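
For readers wondering what "simple retry logic" might look like: the thread does not show Jake's implementation, so the following is only a minimal sketch under assumptions. The decorator name `retry_on`, its parameters, and the exponential backoff schedule are all hypothetical. The local `ConnectionClosedError` class is a stand-in so the example runs without AWS dependencies; in real code you would catch `botocore.exceptions.ConnectionClosedError` instead.

```python
import time
from functools import wraps


class ConnectionClosedError(Exception):
    """Stand-in for botocore.exceptions.ConnectionClosedError (assumption:
    replace with the real botocore class in actual code)."""


def retry_on(exceptions, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Hypothetical helper: retry the wrapped call when `exceptions` is raised.

    Waits base_delay, then 2 * base_delay, then 4 * base_delay, ... between
    attempts, and re-raises the error once max_attempts calls have failed.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the original error
                    sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


@retry_on(ConnectionClosedError, max_attempts=3)
def upload_report(payload):
    """Placeholder for the S3 write; a real version would call boto3 here."""
    return len(payload)
```

Because this wraps your own function, it only helps for S3 calls made inside your code. As the next messages point out, the failure in this thread happens while Prefect writes the result, which is outside the task body.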

Kevin Kho

03/25/2022, 8:38 PM

If this is in a task, I’m missing why the task-level retry can’t handle this?

Kevin Kho

03/25/2022, 8:38 PM

Oh it’s in the result….

Kevin Kho

03/25/2022, 8:39 PM

Have you looked into adding a retry on the boto client, maybe? Cuz this can be done in the AWS config, and then Prefect’s usage of boto can pick it up. Talking about these? I have not used them myself.
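
One way to try the suggestion above without touching Prefect's code is through the environment variables `AWS_RETRY_MODE` and `AWS_MAX_ATTEMPTS`, which botocore reads when a client is created (the equivalent keys in the shared AWS config file are `retry_mode` and `max_attempts`). The sketch below is an assumption-laden illustration, not something tested in this thread: the helper name `configure_boto_retries` and the chosen values are hypothetical, and whether these retries actually cover this particular `ConnectionClosedError` was not confirmed here.

```python
import os


def configure_boto_retries(max_attempts=10, mode="standard", environ=os.environ):
    """Hypothetical helper: set botocore's retry environment variables.

    Existing values are left untouched so that settings supplied by the
    deployment environment take precedence over these defaults.
    """
    environ.setdefault("AWS_RETRY_MODE", mode)
    environ.setdefault("AWS_MAX_ATTEMPTS", str(max_attempts))
    return {key: environ[key] for key in ("AWS_RETRY_MODE", "AWS_MAX_ATTEMPTS")}
```

The variables have to be set in the process that runs the flow, and before any boto client is created, so in practice they would more likely be set in the run environment (for example the container or agent configuration) than called from inside a task.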

Jake

03/30/2022, 5:33 PM

I will look into that; it seems feasible! It’s not a high priority for us at the moment but I will update here if it ever gets implemented 🙂

Kevin Kho

03/30/2022, 5:45 PM

Ok sounds good!
