Hi everyone; I’m running into the following error:...
# ask-community
j
Hi everyone; I’m running into the following error:
ConnectionClosedError('Connection was closed before we received a valid response from endpoint URL: "<insert_s3_url_here>.prefect_result".',)
It is causing our flows to fail intermittently and I’m not sure how to debug this.
k
I think that is specifically a boto3 error. Are you writing a big file, or is your task returning something big? I think we can try increasing the timeout.
j
I don’t think it’s a big file; it seems to be happening when trying to save a prefect result
k
Do you need the result? Because you can always turn checkpointing off if you don’t need it.
j
What exactly is the result? I believe all our writes are to a db so I don’t think the result is needed.
k
The result is used in case you need to restart the Flow. Prefect can restart from failure because it can load the upstream result. It is also used if you cache something across Flow runs, since Prefect can then pull it from the Result. If you don’t need it, you can turn it off at the task level:
@task(checkpoint=False)
So the result is the task output (the return) and this is persisted in the event you need it later
j
I’ve looked into the sizes; they are typically really small, 15 KB max. This issue happens intermittently, but the frequency with which it occurs is increasing.
k
Do you have a lot of concurrent AWS connections? I suspect this could also be rate limiting?
j
potentially.. thanks for the suggestion; I will check!
I don’t think this is due to rate limiting: according to this documentation, we should get “3,500 PUT/COPY/POST/DELETE” requests per second. To my understanding, we only need to make one request per task, and we have far fewer tasks than that, each of which takes much longer than 1 second to complete.
k
So it’s hard to tell what could cause it, but it seems to be infrastructure. This guy says it was related to firewall blocking (which is what I imagine). The answer below that said setting the region helped (I wouldn’t know why). These also seem to indicate a firewall. And this GitHub issue said it was due to uploading a zip, but it is unresolved. I guess on your end, you could also try seeing if it’s specific to the upload, or you can try increasing the timeout (but it doesn’t seem to be a timeout either). This is specifically an AWS error that happens on the connection between your machine and S3, so I really don’t know how else we can help here, unfortunately.
j
Update: this issue is still happening intermittently. We have noticed this specific boto issue in our own codebase (unrelated to Prefect). We built some simple retry logic for when this happens, and it would be a cool feature if Prefect could offer something similar (an automatic retry for this specific Amazon problem)?
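For readers wondering what "simple retry logic" could look like here, a minimal sketch follows. It is not j's actual code: `call_with_retry` and its parameters are hypothetical names, and the exponential backoff is an assumed design choice. In real use you would pass `botocore.exceptions.ConnectionClosedError` in `retry_on`.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying with exponential backoff on the given exceptions.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            # Out of attempts: let the last exception propagate to the caller.
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Usage would be something like `call_with_retry(lambda: upload(), (ConnectionClosedError,))`, where `upload` stands in for whatever S3 call is failing. Any exception not listed in `retry_on` is raised immediately, so genuine bugs are not masked.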
k
If this is in a task, I’m missing why the task-level retry can’t handle this?
Oh it’s in the result….
Have you looked into adding a retry on the boto client, maybe? This can be done in the AWS config, and then Prefect’s usage of boto can pick it up. Talking about these? Have not used them myself.
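One way to try the suggestion above, sketched under assumptions: botocore reads the `AWS_RETRY_MODE` and `AWS_MAX_ATTEMPTS` environment variables (equivalent to `retry_mode` and `max_attempts` in `~/.aws/config`) when a client is created, so setting them before the flow runs should affect any boto3 client created in the same process. The helper name `configure_boto_retries` is hypothetical, and whether these retries cover this particular `ConnectionClosedError` during result writes is untested here and worth verifying.

```python
import os

# Retry modes documented by botocore.
VALID_RETRY_MODES = {"legacy", "standard", "adaptive"}


def configure_boto_retries(max_attempts: int = 10, mode: str = "standard") -> None:
    """Set botocore retry settings via environment variables.

    Hypothetical helper; must run before any boto3 client is created,
    because botocore reads these values at client creation time.
    """
    if mode not in VALID_RETRY_MODES:
        raise ValueError(f"unknown retry mode: {mode!r}")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    os.environ["AWS_RETRY_MODE"] = mode
    os.environ["AWS_MAX_ATTEMPTS"] = str(max_attempts)


configure_boto_retries(max_attempts=10, mode="standard")
```

The same two settings can be placed in the agent's or container's environment instead of in code, which avoids touching the flow definition at all.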
j
I will look into that; it seems feasible! It’s not a high priority for us at the moment but I will update here if it ever gets implemented 🙂
k
Ok sounds good!