Blake Stefansen

03/03/2023, 2:44 PM
Hi everyone,
Issue: If I create a custom run for a deployment (with k8s job infrastructure) in our Prefect Cloud account, it creates 2 duplicate jobs in our k8s cluster for the same flow. Not sure what would cause this. Any ideas?
Expected Behavior: Only one job should be created per flow run.
Extra Info:
Prefect Version: Job - 2.7.10
Prefect Version: Agent - 2.8.4
Agent Log:
14:16:50.649 | INFO    | prefect.agent - Submitting flow run 'ef043a55-545c-4495-8a63-8263931c06e7'
14:16:51.894 | INFO    | prefect.infrastructure.kubernetes-job - Job 'flow-run-started-for-tx-csv-mock-csv-tpwv2': Pod has status 'Pending'.
14:16:51.897 | INFO    | prefect.agent - Completed submission of flow run 'ef043a55-545c-4495-8a63-8263931c06e7'
14:17:01.677 | INFO    | prefect.agent - Submitting flow run 'ef043a55-545c-4495-8a63-8263931c06e7'
14:17:02.388 | INFO    | prefect.infrastructure.kubernetes-job - Job 'flow-run-started-for-tx-csv-mock-csv-bnhgs': Pod has status 'Pending'.
14:17:02.429 | INFO    | prefect.agent - Completed submission of flow run 'ef043a55-545c-4495-8a63-8263931c06e7'
14:17:26.237 | INFO    | prefect.infrastructure.kubernetes-job - Job 'flow-run-started-for-tx-csv-mock-csv-tpwv2': Pod has status 'Running'.
14:17:29.652 | INFO    | prefect.infrastructure.kubernetes-job - Job 'flow-run-started-for-tx-csv-mock-csv-bnhgs': Pod has status 'Running'.
14:19:44.562 | ERROR   | prefect.infrastructure.kubernetes-job - Job 'flow-run-started-for-tx-csv-mock-csv-tpwv2': Job did not complete within timeout of 5s.
14:19:44.685 | ERROR   | prefect.infrastructure.kubernetes-job - Job 'flow-run-started-for-tx-csv-mock-csv-bnhgs': Job did not complete within timeout of 5s.
14:19:44.719 | INFO    | prefect.agent - Reported flow run 'ef043a55-545c-4495-8a63-8263931c06e7' as crashed: Flow run infrastructure exited with non-zero status code -1.
That's interesting. The jobs are not being duplicated anymore now that I set the job watch timeout to none.
15:03:38.869 | INFO    | prefect.agent - Submitting flow run 'e545532e-230f-4910-827b-44f4e2cd92f5'
15:03:39.631 | INFO    | prefect.infrastructure.kubernetes-job - Job 'flow-run-started-for-tx-csv-mock-376-csv-bhxnf': Pod has status 'Pending'.
15:03:39.695 | INFO    | prefect.agent - Completed submission of flow run 'e545532e-230f-4910-827b-44f4e2cd92f5'
15:03:41.950 | INFO    | prefect.infrastructure.kubernetes-job - Job 'flow-run-started-for-tx-csv-mock-376-csv-bhxnf': Pod has status 'Running'.
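For anyone trying to confirm the same symptom in their own agent output, here is a rough sketch that counts how many times each flow run ID shows up in a "Submitting flow run" line. It uses only the Python standard library and is not part of Prefect; the function name `find_duplicate_submissions` and the `SAMPLE_LOG` excerpt are made up for illustration, and the regex assumes the log line format shown in this thread.

```python
import re
from collections import Counter
from typing import Dict

# Assumption: agent lines look like the ones pasted in this thread, e.g.
# 14:16:50.649 | INFO    | prefect.agent - Submitting flow run '<uuid>'
SUBMIT_RE = re.compile(r"prefect\.agent - Submitting flow run '([0-9a-f-]+)'")


def find_duplicate_submissions(log_text: str) -> Dict[str, int]:
    """Return flow run IDs that the agent submitted more than once."""
    counts = Counter(SUBMIT_RE.findall(log_text))
    return {run_id: n for run_id, n in counts.items() if n > 1}


# Excerpt of agent lines copied from the logs in this thread.
SAMPLE_LOG = """\
14:16:50.649 | INFO    | prefect.agent - Submitting flow run 'ef043a55-545c-4495-8a63-8263931c06e7'
14:16:51.897 | INFO    | prefect.agent - Completed submission of flow run 'ef043a55-545c-4495-8a63-8263931c06e7'
14:17:01.677 | INFO    | prefect.agent - Submitting flow run 'ef043a55-545c-4495-8a63-8263931c06e7'
15:03:38.869 | INFO    | prefect.agent - Submitting flow run 'e545532e-230f-4910-827b-44f4e2cd92f5'
"""

if __name__ == "__main__":
    for run_id, n in find_duplicate_submissions(SAMPLE_LOG).items():
        print(f"flow run {run_id} was submitted {n} times")
```

Only the "Submitting" lines are counted, so the matching "Completed submission" lines do not inflate the totals. A flow run that appears here was handed to the infrastructure more than once, which matches the two Kubernetes jobs seen for 'ef043a55-...'.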

Ryan Peden

03/03/2023, 3:42 PM
Are you able to test this with Prefect 2.8.3? There was a small change to the job watch timeout in 2.8.4; at a glance, it doesn't look like it would cause this, but if trying 2.8.3 won't take you too long, that would help narrow down possible causes.