Prefect Community #ask-community

Abyaya Lamsal

03/24/2025, 5:41 PM

Hi team, I am currently migrating from Prefect agents to workers. At the same time, I am also upgrading from 2.14.16 to 2.20.16. I started seeing intermittent issues with a subset of flows. This seems to happen randomly, so not every flow run has this issue. For reference, I am using a custom image. Attached logs below:

13:29:55.058 | INFO    | prefect.flow_runs.worker - Worker 'KubernetesWorker a5d26a51-ff36-4697-8daf-f8aa3a0fea54' submitting flow run '855ead39-db97-4fa6-85b0-723ddd90b7c8'
13:29:55.236 | INFO    | prefect.flow_runs.worker - Creating Kubernetes job...
13:29:55.314 | INFO    | prefect.flow_runs.worker - Completed submission of flow run '855ead39-db97-4fa6-85b0-723ddd90b7c8'
13:29:55.349 | INFO    | prefect.flow_runs.worker - Job 'adept-hog-hwcjq': Pod has status 'Pending'.
13:30:55.327 | ERROR   | prefect.flow_runs.worker - Job 'adept-hog-hwcjq': Pod never started.
13:30:55.570 | INFO    | prefect.flow_runs.worker - Pod event 'Scheduled' at 2025-03-18 13:29:55+00:00: Successfully assigned [OUR_NAMESPACE]/adept-hog-hwcjq-pqhbc to <INTERNAL_NODE>
13:30:55.571 | INFO    | prefect.flow_runs.worker - Job event 'SuccessfulCreate' at 2025-03-18 13:29:55+00:00: Created pod: adept-hog-hwcjq-pqhbc
13:30:55.572 | INFO    | prefect.flow_runs.worker - Pod event 'Pulling' at 2025-03-18 13:29:56+00:00: Pulling image "<CUSTOM_IMAGE>"
13:30:55.572 | INFO    | prefect.flow_runs.worker - Pod event 'Pulled' at 2025-03-18 13:30:33+00:00: Successfully pulled image "<CUSTOM_IMAGE>" in 37.16s (37.16s including waiting). Image size: <SIZE> bytes.
13:30:55.716 | INFO    | prefect.flow_runs.worker - Reported flow run '855ead39-db97-4fa6-85b0-723ddd90b7c8' as crashed: Flow run infrastructure exited with non-zero status code -1.
<NORMAL EXECUTION>
...
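
One detail worth pulling out of the log above is how long the worker waited before reporting "Pod never started". The following is a minimal sketch that measures that gap from the two relevant log lines; the helper names `parse_time` and `seconds_between` are made up for illustration and are not part of Prefect. The result is just under 60 seconds, which would be consistent with a pod-start timeout of about one minute being hit while the custom image was still being pulled and started. I believe the Kubernetes work pool exposes this as a `pod_watch_timeout_seconds` job variable with a default of 60, but that is an assumption to verify against the prefect-kubernetes documentation for your version.

```python
from datetime import datetime

# Two lines copied verbatim from the worker log above.
LOG = """\
13:29:55.349 | INFO    | prefect.flow_runs.worker - Job 'adept-hog-hwcjq': Pod has status 'Pending'.
13:30:55.327 | ERROR   | prefect.flow_runs.worker - Job 'adept-hog-hwcjq': Pod never started.
"""


def parse_time(line: str) -> datetime:
    """Parse the leading HH:MM:SS.mmm timestamp of a worker log line."""
    stamp = line.split("|", 1)[0].strip()
    return datetime.strptime(stamp, "%H:%M:%S.%f")


def seconds_between(log: str, start_marker: str, end_marker: str) -> float:
    """Seconds from the first line containing start_marker to the last line containing end_marker."""
    start = end = None
    for line in log.splitlines():
        if start is None and start_marker in line:
            start = parse_time(line)
        elif end_marker in line:
            end = parse_time(line)
    if start is None or end is None:
        raise ValueError("marker not found in log")
    return (end - start).total_seconds()


gap = seconds_between(LOG, "Pod has status 'Pending'", "Pod never started")
print(f"Worker gave up after {gap:.3f} s")
```

If the gap in other crashed runs is also close to 60 seconds, raising the pod-start timeout on the work pool would be the first thing to try.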

The job eventually runs. The issue is that if I subscribe to any failure notifications, I get randomly bombarded with crash notifications, which is not very helpful. Would appreciate any pointers here. Here is a sample of the job logs:

Abyaya Lamsal

03/25/2025, 4:16 PM

Hi team, this is blocking some migrations. Would love any insights you might have on this.