Ian Thomas
12/18/2023, 7:03 PM
Ian Thomas
12/18/2023, 7:09 PM
16:18:15.289 | INFO | prefect.flow_runs.worker - Worker 'KubernetesWorker 69f0259a-aa9d-492f-a735-9041ff25e12c' submitting flow run 'e4c0c8a9-c896-4145-b0e4-61b6259de0b2'
16:18:15.923 | INFO | prefect.flow_runs.worker - Creating Kubernetes job...
16:18:16.166 | INFO | prefect.flow_runs.worker - Job 'indigo-dinosaur-qs8dx': Pod has status 'Pending'.
16:18:16+00:00: 0/9 nodes are available: 9 node(s) didn't match Pod's node affinity/selector. preemption: 0/9 nodes are available: 9 Preemption is not helpful for scheduling..
16:18:16+00:00: Created pod: indigo-dinosaur-qs8dx-n6vbr
16:18:16.174 | INFO | prefect.flow_runs.worker - Completed submission of flow run 'e4c0c8a9-c896-4145-b0e4-61b6259de0b2'
16:18:17+00:00: Pod should schedule on: machine/default-dthxn
16:18:51+00:00: Successfully assigned prefect-narwhals/indigo-dinosaur-qs8dx-n6vbr to ip-10-144-37-210.us-west-2.compute.internal
16:19:16.072 | ERROR | prefect.flow_runs.worker - Job 'indigo-dinosaur-qs8dx': Pod never started.
16:19:16.087 | INFO | prefect.flow_runs.worker - Pod event 'FailedScheduling' at 2023-12-18
16:19:16.088 | INFO | prefect.flow_runs.worker - Job event 'SuccessfulCreate' at 2023-12-18
16:19:16.089 | INFO | prefect.flow_runs.worker - Pod event 'Nominated' at 2023-12-18
16:19:16.089 | INFO | prefect.flow_runs.worker - Pod event 'Scheduled' at 2023-12-18
16:19:16.090 | INFO | prefect.flow_runs.worker - Pod event 'Pulling' at 2023-12-18 16:18:51+00:00: Pulling image "730998372749.dkr.ecr.us-west-2.amazonaws.com/tackle-application-customer-access:c2aa66a8ed04689cbfe58c2ae079cb80c738c1b8"
16:19:16.268 | INFO | prefect.flow_runs.worker - Reported flow run 'e4c0c8a9-c896-4145-b0e4-61b6259de0b2' as crashed: Flow run infrastructure exited with non-zero status code -1.
The Flow does run without issue once the Node, provided by AWS as an On-Demand EC2 instance, becomes available and the Job's Pod can be scheduled. Is there a way to have Prefect be more tolerant of an initial FailedScheduling event, or to wait longer before assigning a final state to the flow run?
Ian Thomas
12/18/2023, 8:33 PM
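A rough sketch of what the log timestamps above suggest: the worker gave up almost exactly 60 seconds after creating the job, and most of that window went to waiting for the node. The timestamps are copied from the log. The `pod_watch_timeout_seconds` job variable and the value 300 are assumptions here (it should be the Kubernetes work pool's "Pod Watch Timeout Seconds" setting, but check the base job template of your own pool before relying on it).

```python
from datetime import datetime

# Timestamps copied from the worker log above (same day, UTC).
FMT = "%H:%M:%S.%f"
job_created = datetime.strptime("16:18:15.923", FMT)        # "Creating Kubernetes job..."
node_assigned = datetime.strptime("16:18:51.000", FMT)      # "Successfully assigned ..."
pod_never_started = datetime.strptime("16:19:16.072", FMT)  # "Pod never started."


def seconds_between(start: datetime, end: datetime) -> float:
    """Elapsed seconds between two log timestamps."""
    return (end - start).total_seconds()


# How long the worker watched the pod before marking the run as crashed.
watch_window = seconds_between(job_created, pod_never_started)

# How much of that window was spent waiting for a node to become available.
wait_for_node = seconds_between(job_created, node_assigned)

# What was left for pulling the image and starting the container.
remaining_for_startup = watch_window - wait_for_node

# Assumption: the work pool's base job template exposes this variable.
# 300 is an illustrative value, not a recommendation.
job_variables = {"pod_watch_timeout_seconds": 300}

print(f"watch window:          {watch_window:.3f}s")
print(f"waiting for node:      {wait_for_node:.3f}s")
print(f"left for pod startup:  {remaining_for_startup:.3f}s")
print(f"proposed job variables: {job_variables}")
```

If the variable is available in your pool, a dict like this would typically go under the deployment's `job_variables` (or be set as the pool default), so the worker keeps watching long enough for the On-Demand node to come up and the image to be pulled.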