Stephen Lloyd

07/17/2023, 11:08 AM
Hi. I have a flow that calls an API with start and end date parameters. I'm seeing the following behavior:
• run locally with a week date range
• run on ECS and Prefect Online with a week date range
  ◦ the task with the API call does not fail; it simply continues to run forever
  ◦ this same flow/task ran perfectly on Prefect 1.0 online
• run on ECS and Prefect Online with a one-day date range
The process itself has LOTS of resources. The API call takes a while: 12-15 min for the one-week date range.
• We cannot find anything in AWS that could cause a timeout
• We are using the requests library and do not have any timeouts set
• We have other long-running tasks and flows (e.g. dbt), so we don't think it is any kind of Prefect timeout issue at the task or flow level
I can't think of what else to add at the moment. I'd love any ideas on how to find the problem!
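A note on the "no timeouts set" point: with requests, a call made without a `timeout` argument can wait indefinitely if the connection stalls, which would look like a task that runs forever. The sketch below is one way to make such a hang surface as an error instead. The function name `fetch_report`, the `start_date`/`end_date` parameter names, and the timeout values are all assumptions for illustration, not details from this thread.

```python
import requests

# (connect timeout, read timeout) in seconds. Both values are assumptions;
# the read timeout is set above the 12-15 min the one-week call reportedly takes.
DEFAULT_TIMEOUT = (10, 20 * 60)


def fetch_report(url, start_date, end_date, session=None, timeout=DEFAULT_TIMEOUT):
    """Call the API for a date range and raise instead of hanging on a stall."""
    http = session or requests.Session()
    try:
        response = http.get(
            url,
            # Hypothetical query parameter names; use whatever the real API expects.
            params={"start_date": start_date, "end_date": end_date},
            timeout=timeout,
        )
    except requests.exceptions.Timeout as exc:
        # Covers both connect and read timeouts.
        raise RuntimeError(
            f"API call for {start_date}..{end_date} timed out"
        ) from exc
    response.raise_for_status()
    return response.json()
```

Note that the read timeout in requests limits how long the client waits between bytes from the server, not the total duration of the call, so a slow but steadily streaming response would not trip it.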


07/17/2023, 12:27 PM
Split into multiple runs?

Dominic Tarro

07/17/2023, 12:42 PM
Do you have CloudWatch configured? I have had situations where there is a connection issue to the Prefect client that crashes the ECS Task but because it cannot connect to Prefect Cloud the status on your dashboard doesn't update.
Typically I would find them in the individual flow run's logs.

Stephen Lloyd

07/18/2023, 3:47 AM
@Susan Joshi Can we check on this today?
Hi @Dominic Tarro. We do have CloudWatch turned on and we route logs to New Relic. The process hangs/stops on the API call.
Well, it seems like some kind of timeout problem. We've worked around it for now by mapping the request out over multiple smaller API calls. Thanks.
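For anyone landing here later, the workaround of splitting one long date range into smaller calls could look roughly like the sketch below. It is not the code used in this thread: the helper name `split_date_range` is hypothetical, and it assumes the API treats both start and end dates as inclusive.

```python
from datetime import date, timedelta


def split_date_range(start, end, days_per_chunk=1):
    """Yield (chunk_start, chunk_end) pairs covering start..end inclusive."""
    if days_per_chunk < 1:
        raise ValueError("days_per_chunk must be at least 1")
    chunk_start = start
    while chunk_start <= end:
        # Clamp the last chunk so it never runs past the requested end date.
        chunk_end = min(chunk_start + timedelta(days=days_per_chunk - 1), end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end + timedelta(days=1)


# One week split into one-day ranges, one pair per smaller API call.
chunks = list(split_date_range(date(2023, 7, 10), date(2023, 7, 16)))
```

In a Prefect 2 flow, the resulting start and end dates could then be passed to a task's `.map()` so each chunk runs as its own task run, which keeps every individual API call short.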