Rajeev Lahoty

07/28/2023, 4:52 PM
Hi everyone - some of our flow runs fail with the error: Submission failed. RuntimeError: Timed out after 240.0124716758728s while watching waiting for container start. In the Azure portal, I see a container instance in Failed status, and the container inside that instance in Waiting status. Why does this happen, and how do we handle it? Are there environment settings we should change?
👀 1

Christopher Boyd

07/28/2023, 5:20 PM
Hi Rajeev, How are these flows being deployed? Is this an agent, or a worker?

Rajeev Lahoty

07/28/2023, 5:21 PM
Hi Chris - we're using an agent.

Christopher Boyd

07/28/2023, 5:24 PM
How are you deploying these flows?

Rajeev Lahoty

07/28/2023, 5:25 PM
The code is on GitHub, and we create deployments using Python.

Christopher Boyd

07/28/2023, 5:25 PM
Can you share the deployment code? 240s is peculiar.
You likely won't get more on the Prefect side if the container fails to start, as that's infrastructure provisioning,
so it would be necessary to look in the resource group where your agent is running and check the Azure side.
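For context on where a message like "Timed out after 240.01...s" can come from, here is a minimal sketch of a start-timeout watch loop. It is not Prefect's actual implementation: `wait_for_container_start`, `get_state`, and the state names are hypothetical, and the 240-second default is only an assumption mirroring the error above (prefect-azure's ACI job block appears to expose this as `task_start_timeout_seconds`; worth confirming against the installed version). Because the deadline is checked after each poll, the reported elapsed time ends up slightly above the configured limit.

```python
import time
from typing import Callable


def wait_for_container_start(
    get_state: Callable[[], str],
    timeout_seconds: float = 240.0,  # assumption: mirrors the 240s in the error
    poll_interval: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until the container leaves 'Waiting' or the timeout elapses."""
    started = clock()
    while True:
        state = get_state()
        if state != "Waiting":
            # The container either started or failed outright; let the caller decide.
            return state
        elapsed = clock() - started
        if elapsed > timeout_seconds:
            # Checked after a poll, so elapsed is a little over the limit.
            raise RuntimeError(
                f"Timed out after {elapsed}s while waiting for container start."
            )
        sleep(poll_interval)
```

If the container sits in Waiting because Azure is slow to pull the image or allocate capacity, a loop like this gives up even though nothing is wrong on the Prefect side, which is why the container group's events in Azure are the place to look.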

Rajeev Lahoty

07/28/2023, 5:26 PM
Sample deployment code:
import sys, os

# Make the shared Prefect package importable from this script's location
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '../../../Prefect')))

from prefect.deployments import Deployment
from utilities.prefectconfig import get_default_settings
from flows.teams.risk_analytics_daily import team_risk_analytics_daily_process

# Shared storage/infrastructure blocks and work pool settings
github_block, azure_container_instance_job_block, work_pool_name, work_queue_name = get_default_settings()

deployment = Deployment.build_from_flow(
    flow=team_risk_analytics_daily_process,
    name="team-risk-analytics-daily-process",
    work_pool_name=work_pool_name,
    work_queue_name=work_queue_name,
    tags=["teams", "loading analytics data"],
    storage=github_block,
    infrastructure=azure_container_instance_job_block,
)

if __name__ == "__main__":
    deployment.apply()
Yeah, something like a 4-minute timeout (240s) somewhere.
We're also working with the infra team on our side.

Christopher Boyd

07/28/2023, 5:29 PM
azure_container_instance_job_block - how is that configured?

Rajeev Lahoty

07/28/2023, 5:30 PM
I can send you a screenshot if you want, or let me know if there is something specific you are looking for.
Also, this error is random: it does not hit the same job or happen at the same time.

Christopher Boyd

07/28/2023, 5:37 PM
If the error is random, I'd be suspicious of Azure.
👍 1
All Prefect is doing is submitting an ARM template to Azure.

Rajeev Lahoty

07/28/2023, 5:38 PM
Thanks Chris. Appreciate it.
This is for Azure Web Apps, but I expect there is a similar config for ACI, since I think they all use the same infrastructure on the Azure backend: