Marwan Sarieddine
05/20/2022, 11:57 AM
prefecthq/prefect:0.14.22-python3.8
Here is a snippet of the logs
[2022-05-20 08:00:00,000] INFO - prefect-agent-staging | Deploying flow run ccefa859-6380-48c7-9a58-c7f1030cb294 to execution environment...
INFO:prefect-agent-staging:Deploying flow run ccefa859-6380-48c7-9a58-c7f1030cb294 to execution environment...
[2022-05-20 08:00:01,276] INFO - prefect-agent-staging | Completed deployment of flow run ccefa859-6380-48c7-9a58-c7f1030cb294
INFO:prefect-agent-staging:Completed deployment of flow run ccefa859-6380-48c7-9a58-c7f1030cb294
INFO:prefect-agent-staging:Deploying flow run a8647852-1667-461f-a3ee-e749b580f2ac to execution environment...
[2022-05-20 08:00:37,759] INFO - prefect-agent-staging | Deploying flow run a8647852-1667-461f-a3ee-e749b580f2ac to execution environment...
WARNING:urllib3.connectionpool:Retrying (Retry(total=5, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='api.prefect.io', port=443): Read timed out. (read timeout=15)")': /
As you can see, the first flow run (ccefa859-6380-48c7-9a58-c7f1030cb294) is deployed just fine at 08:00:00 UTC.
The second flow run (a8647852-1667-461f-a3ee-e749b580f2ac), however, fails deployment: the agent logs this warning, which indicates that an error occurred:
WARNING:urllib3.connectionpool:Retrying (Retry(total=5, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='api.prefect.io', port=443): Read timed out. (read timeout=15)")': /
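A quick way to spot runs like the second one is to pair the agent log lines by flow run ID. This is only a minimal sketch that assumes the log format shown above; `find_incomplete_deployments` and `SAMPLE` are hypothetical names and not part of Prefect.

```python
import re

# Hypothetical helper (not part of Prefect): pair "Deploying" and
# "Completed deployment" agent log lines by flow run ID.
FLOW_RUN_ID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
DEPLOYING = re.compile(rf"Deploying flow run ({FLOW_RUN_ID})")
COMPLETED = re.compile(rf"Completed deployment of flow run ({FLOW_RUN_ID})")


def find_incomplete_deployments(log_text):
    """Return flow run IDs that were picked up but never reported as deployed."""
    started = []       # IDs in first-seen order, without duplicates
    completed = set()  # IDs that logged "Completed deployment"
    for line in log_text.splitlines():
        match = DEPLOYING.search(line)
        if match:
            if match.group(1) not in started:
                started.append(match.group(1))
            continue
        match = COMPLETED.search(line)
        if match:
            completed.add(match.group(1))
    return [run_id for run_id in started if run_id not in completed]


# Sample lines copied from the agent log in this thread.
SAMPLE = """\
INFO:prefect-agent-staging:Deploying flow run ccefa859-6380-48c7-9a58-c7f1030cb294 to execution environment...
INFO:prefect-agent-staging:Completed deployment of flow run ccefa859-6380-48c7-9a58-c7f1030cb294
INFO:prefect-agent-staging:Deploying flow run a8647852-1667-461f-a3ee-e749b580f2ac to execution environment...
"""

print(find_incomplete_deployments(SAMPLE))
```

Run against the snippet, this reports only the second flow run, since it has a "Deploying" line but no matching "Completed deployment" line.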
Anna Geller
05/20/2022, 12:08 PM
Marwan Sarieddine
05/20/2022, 12:09 PM
Anna Geller
05/20/2022, 12:12 PM
Marwan Sarieddine
05/20/2022, 12:12 PM
Anna Geller
05/20/2022, 12:14 PM
Marwan Sarieddine
05/20/2022, 12:15 PM
Deploying flow run
self.logger.info(
    f"Deploying flow run {flow_run.id} to execution environment..."
)
self._mark_flow_as_submitted(flow_run)
# Call the main deployment hook
deployment_info = self.deploy_flow(flow_run)
Before the agent starts creating the Kubernetes job in self.deploy_flow, it has to run _mark_flow_as_submitted, where it makes a self.client.set_flow_run_state call and then a series of self.client.set_task_run_state calls.
Given the connection error, I think one of these client calls is failing and is not being retried correctly.
Anna Geller
05/20/2022, 12:57 PM
Marwan Sarieddine
05/20/2022, 1:52 PM
Anna Geller
05/20/2022, 3:21 PM
Marwan Sarieddine
05/20/2022, 4:54 PM
_mark_flow_as_submitted
the error is clearly happening in deploy_flow
here https://github.com/PrefectHQ/prefect/blob/0.14.22/src/prefect/agent/kubernetes/agent.py#L418
self.batch_client.create_namespaced_job(
    namespace=self.namespace, body=job_spec
)
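If a transient timeout around this call turns out to be the cause, one option would be to wrap the call in a bounded retry with backoff. The sketch below is generic and not what the agent does: `retry_call` and `flaky_create_job` are hypothetical names, and `TimeoutError` stands in for whatever exception the real client raises. Retrying job creation is only safe if a repeated request cannot create a duplicate job.

```python
import time


def retry_call(func, attempts=3, base_delay=0.5, retry_on=(TimeoutError,), sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical stand-in for the job-creation call: fails twice, then succeeds.
calls = {"count": 0}


def flaky_create_job():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("Read timed out.")
    return "job-created"


# Record the delays instead of actually sleeping, so the example runs instantly.
delays = []
result = retry_call(flaky_create_job, attempts=3, base_delay=0.5, sleep=delays.append)
print(result, calls["count"], delays)
```

Passing `sleep` as a parameter keeps the backoff schedule easy to inspect and test without waiting.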
Anna Geller
05/20/2022, 5:17 PM