# prefect-community
t
Hello again 😄 regarding the `RunNamespacedJob` example (from https://github.com/anna-geller/packaging-prefect-flows/blob/master/flows_task_library/s3_kubernetes_run_RunNamespacedJob_and_get_logs.py) --- we implemented it and got it to work, but it seems that it's now failing on:
```
VALIDATIONFAIL signal raised: VALIDATIONFAIL('More than one dummy pod')
```
because there seem to be many pod "residues" of previous runs:
```
['prefect-agent-7745fb9694-6fwk4', 'prefect-job-47d072a8-4pbsf', 'seg-pred-test-cm54l', 'seg-pred-test-doron', 'seg-pred-test-l2j5l', 'seg-pred-test-zvwld']
```
so wouldn't k8s keep the pods around, given that we set `delete_job_after_completion = False`? And even if the job is deleted successfully, wouldn't it keep the pods around? Or are the pods supposed to be deleted automatically if the job is deleted…?
k
I think this is a question for Anna but she's away from the computer so we'll wait for her
a
tag me next time, Kevin 😛 looking now
@Tom Klein `delete_job_after_completion` is for the Kubernetes job, not for a pod. A single job can result in many pods, afaik. In general, it's all configurable; you need to dig deeper into those Kubernetes tasks, Tom 🙂 but I'm happy to help you if you have trouble understanding them. Can you share the flow code for those Kubernetes tasks that seem confusing to you?
I believe only the `seg-pred-test-*` pods are those spun up by those Kubernetes tasks:
• `prefect-agent-7745fb9694-6fwk4` is for the agent itself
• `prefect-job-47d072a8-4pbsf` is the flow run pod
• the rest are from your Kubernetes-task Kubernetes jobs
I think the pods are not deleted - that's why, in the example you mentioned, I used the `DeleteNamespacedPod` task to clean those up. What's the end goal you're trying to achieve here? Do you want to keep those pods or delete them?
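For reference, a minimal sketch of the pattern the linked example follows, as far as I can tell: the job is created with `delete_job_after_completion=False`, the pod is found by listing pods and filtering on the job-name prefix, its logs are read back, and the pod is then deleted explicitly. The job body, namespace, auth configuration, and the use of `ReadNamespacedPodLogs` here are assumptions, not the exact example code.

```python
# Sketch only - job body, namespace, and auth/secret configuration are placeholders.
from prefect import Flow, task
from prefect.tasks.kubernetes import (
    RunNamespacedJob,
    ListNamespacedPod,
    ReadNamespacedPodLogs,
    DeleteNamespacedPod,
)

JOB_NAME = "seg-pred-test"
job_body = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": JOB_NAME},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": JOB_NAME, "image": "busybox", "command": ["echo", "hello"]}
                ],
                "restartPolicy": "Never",
            }
        }
    },
}

run_job = RunNamespacedJob(
    body=job_body, namespace="default", delete_job_after_completion=False
)
list_pods = ListNamespacedPod(namespace="default")
read_logs = ReadNamespacedPodLogs(namespace="default")
delete_pod = DeleteNamespacedPod(namespace="default")


@task
def get_job_pod_name(pod_list):
    # keep only pods whose names start with the job name; the example
    # additionally validates that exactly one such pod exists
    matching = [
        p.metadata.name for p in pod_list.items if p.metadata.name.startswith(JOB_NAME)
    ]
    return matching[0]


with Flow("run-namespaced-job-and-get-logs-sketch") as flow:
    job_done = run_job()
    pods = list_pods(upstream_tasks=[job_done])
    pod_name = get_job_pod_name(pods)
    logs = read_logs(pod_name=pod_name)
    delete_pod(pod_name=pod_name, upstream_tasks=[logs])
```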
t
Our goal is basically to launch a job and then clean up after it, but bring the logs back into the Prefect UI so they're more easily visible to our data scientists. We're not interested in keeping the pods around other than to extract the logs from them. But if, for whatever reason, we ever run into a situation where there are pods remaining, then:
1. The validation condition will fail
2. There's no easy way to clean it up from within Prefect anymore, nor to know which pod belongs to the job we just created (right?)
a
Would you be able to share an example I could reproduce on my end? It's quite likely that just some tiny piece of your task configuration is missing. The use case you described is 100% doable with the existing Kubernetes tasks.
t
@Anna Geller hmm, I can definitely share what we have now, but since we constantly change the code (and since the current state of the k8s cluster is the result of multiple runs, some of them failed, and some for unrelated reasons), I'm not sure the exact situation is easily reproducible just from code -- basically, if you run your own (example) code multiple times but omit the last `delete` step, you will find yourself in the state we're currently in, right? Where you have multiple pods that have not been deleted, and the validation (which checks if there are other pods that start with the same prefix) would fail if you try to run it again… right? I'm basically wondering what's the right way to cope with this situation - we can't just omit the validation (since then the logic of returning the "first" pod that is found would make no sense; nothing guarantees it's the one that was just created) - and I couldn't really find other ways to get the pod name for the job that was just created (it's not returned by the `ReadNamespacedJob` task, for example, nor by `RunNamespacedJob` itself).
a
Ok, so the actual problem you see is that when `RunNamespacedJob` fails, the `DeleteNamespacedJob` doesn't run, leaving a zombie pod undeleted, correct?
t
there is no `ReadNamespacedJob` (in your example), there's a `ListNamespacedPod` to list all the pods and then filter for the ones whose names start with the name of our job -- we tried `ReadNamespacedJob` as an alternative way of maybe getting the pod name directly… 😕 but yes - what you wrote is correct and seems to be what actually happened. For whatever reason (it doesn't even really matter why) there was some initial zombie pod, after which the process didn't stop generating more of them, because each time - even though it begins with a "delete_if_exists" for the job - that doesn't actually remove the pod (I'm not a k8s expert, but it seems it's possible for the job to no longer exist even though the pod does). And when the validation fails, the pod that was just created for the newly created job still exists.
or in other words:
• a single "zombie" pod exists, with no job to delete
• the attempt to delete the job (via `DeleteNamespacedJob`) fails since it doesn't exist
• a new job is created and run
• pods are listed
• the validation fails since there are now two pods with that name
• the final delete step (`DeleteNamespacedPod`) is never reached, so a new zombie is created --- and even if it were reached, it would only take care of this current run, not of the other zombie --- but it can't even do that, since it's unclear which of the two pods is "our" pod (this run's pod, that is)
and now the process begins again with 2 zombies instead of 1, and so on… maybe we need some initial cleanup step that also tries to delete all zombie pods with that name? My worry is that it's kind of low-level fiddling with other runs… what if someone legitimately ran this flow more than once in parallel? All I want is for a single job to act as an "atomic" operation (that cleans up only after itself) -- I don't mind if there are multiple such jobs running simultaneously, etc.
a
this was a typo, I meant `RunNamespacedJob`
✅ 1
> what you wrote is correct and seems to be what actually happened
you can solve it using triggers - adding `trigger=all_finished` should ensure that the pod will get deleted even if `RunNamespacedJob` fails - add the same line to the delete task
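Roughly what that looks like in code, assuming the delete tasks from the example (a sketch, not the exact example code):

```python
# Sketch: attach the all_finished trigger so cleanup runs even if the
# RunNamespacedJob task upstream fails. Namespaces are placeholders.
from prefect.tasks.kubernetes import DeleteNamespacedJob, DeleteNamespacedPod
from prefect.triggers import all_finished

# all_finished = run once all upstream tasks have finished,
# regardless of whether they succeeded or failed
delete_job = DeleteNamespacedJob(namespace="default", trigger=all_finished)
delete_pod = DeleteNamespacedPod(namespace="default", trigger=all_finished)
```

The tasks are wired downstream of `RunNamespacedJob` exactly as before; the only change is the `trigger` keyword.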
t
but if we get to that line and there were two pods with that name to begin with, we wouldn't know which one to delete, right?
a
then `delete_if_exists`? 😄
t
that step only deletes the job, not the pod
the point is that you have to know the specific name of the pod in order to delete it, and you can't know which is "your" pod
a
I meant - using the same logic - first delete if exists, then try to run the pod
t
right, but you wouldn't know which pod to try and delete. Let's say there are `seg-pred-123` and `seg-pred-456` as pods when you start to run - which one do you delete?
and in fact, I don't want to delete existing jobs/pods. Maybe someone is legitimately running a similar job in parallel? I just want this run to clean up after itself; I don't care about other runs
basically, what we're really missing is a way to identify the pod of the job we just created via `RunNamespacedJob` - without relying on there being only a single pod with that name prefix. In `kubectl` this is apparently achieved with `describe job`, or something like that. It doesn't seem possible via Prefect (or maybe it is and I'm missing something - that's what I'm asking)
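For what it's worth, the Kubernetes Job controller labels the pods it creates with `job-name=<job name>` and `controller-uid=<job uid>` (the same labels visible in the job metadata shown further down), so the pods belonging to a specific job can be looked up with a label selector. A sketch with the plain kubernetes Python client, outside of the Prefect task library (job name and namespace are placeholders):

```python
# Sketch: find the pods created by a specific Job via its controller-uid label,
# using the plain kubernetes Python client. Names and namespace are placeholders.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

batch = client.BatchV1Api()
core = client.CoreV1Api()

job = batch.read_namespaced_job(name="seg-pred-test", namespace="default")
uid = job.metadata.labels["controller-uid"]

# only the pods spawned by this exact Job instance
pods = core.list_namespaced_pod(
    namespace="default", label_selector="controller-uid=" + uid
)
print([p.metadata.name for p in pods.items])
```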
a
this seems to be a cluster administration problem - perhaps you can have a separate job deleting pods that start with a given name? I can totally understand why it's beneficial to clean those up, but I don't have a clear recipe other than this code example and using triggers. I would recommend using triggers to solve this problem; this way, the case with 2 pods with the same name shouldn't happen in the first place:
> if we get to that line and there were two pods with that name to begin with, we wouldn't know which one to delete, right?
the `all_finished` trigger is the most reliable and cleanest approach I can recommend at this time
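If the separate-cleanup route is ever taken, a sketch of what such a sweep could look like with the plain kubernetes Python client (prefix and namespace are placeholders); restricting it to finished pods avoids touching pods that belong to a run that is still in progress:

```python
# Sketch: one-off cleanup that deletes leftover pods whose names start with a
# given prefix. Prefix and namespace are placeholders.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster
core = client.CoreV1Api()

PREFIX = "seg-pred-test"
for pod in core.list_namespaced_pod(namespace="default").items:
    finished = pod.status.phase in ("Succeeded", "Failed")
    if pod.metadata.name.startswith(PREFIX) and finished:
        core.delete_namespaced_pod(name=pod.metadata.name, namespace="default")
```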
t
`all_finished` will make sure the step runs, but if there's more than one pod with that name, we wouldn't know which of the two (or three, or four) is "our" pod (the one that was generated by this specific Prefect flow run). We definitely want to allow more than one instance of the job to run in parallel. Maybe the solution is to give the job a unique name per run…? (that way there's a 1:1 relation between "jobs with that name" and "corresponding pods"). The case of multiple pods can definitely happen if more than one person runs an instance of this flow
a
> if there's more than one pod with that name
why would it be?
t
for example, if there was a scheduled run of the flow, and someone else wanted to also trigger a manual run of this flow (with different parameters) while the scheduled run was still running
a
you could set a concurrency limit of 1 to avoid that if this is an actual problem
t
but we don't want to limit it. We want it to be possible to run this multiple times in parallel
I think we're kind of walking around the problem here… the problem isn't solved by limiting concurrency (especially when such concurrency is needed), but by being able to identify 1:1 the pod that was just created by the job we just ran. I guess giving the job a unique name (e.g. `my-cool-job-467fdfg5a`) solves that problem (since there would only ever be one pod that matches that unique name), I just don't know if that's a "best practice". Seems to me like it's more of a workaround.
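A sketch of the unique-name idea, e.g. building the job name from the flow run id in `prefect.context` (the suffix choice and all names here are assumptions):

```python
# Sketch: give each flow run its own Kubernetes job name so the job -> pod
# lookup is unambiguous. Job body, image, and namespace are placeholders.
import uuid

import prefect
from prefect import Flow, task
from prefect.tasks.kubernetes import RunNamespacedJob


@task
def make_job_body():
    # unique per flow run: use the flow run id if available, else a random suffix
    suffix = prefect.context.get("flow_run_id", uuid.uuid4().hex)[:8]
    job_name = "seg-pred-test-" + suffix
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name},
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {"name": "main", "image": "busybox", "command": ["echo", "hello"]}
                    ],
                    "restartPolicy": "Never",
                }
            }
        },
    }


run_job = RunNamespacedJob(namespace="default", delete_job_after_completion=False)

with Flow("unique-job-name-sketch") as flow:
    run_job(body=make_job_body())
```

With a unique job name there is at most one pod whose name starts with that name, so the "more than one dummy pod" validation and the "take the first pod" logic stop being ambiguous.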
a
gotcha. I don't think that giving unique names to pods violates any best practices. And even if it did, if it solves your problem, go for it 🙂
t
the pods have unique names by default, I think; it's the job that I don't know if we want to give a unique name to… my knowledge of k8s is too limited to know if there's any important reason to want jobs of a similar "nature" (i.e. image, code, whatever) to have the same name… I'll ask our devops anyway. It would all have been solved if we could somehow just get the name of the pod related to the job we just created via `RunNamespacedJob` -- I tried to use the `ReadNamespacedJob` task and the only thing that looked like an identifier was the `controller-uid` or something, but I don't know if that should/could be used as an indirect identifier for the pod:
```
{'api_version': 'batch/v1',
 'kind': 'Job',
 'metadata': {'annotations': None,
              'cluster_name': None,
              'creation_timestamp': datetime.datetime(2022, 4, 13, 16, 1, 56, tzinfo=tzlocal()),
              'deletion_grace_period_seconds': None,
              'deletion_timestamp': None,
              'finalizers': None,
              'generate_name': None,
              'generation': None,
              'labels': {'controller-uid': '35ca5ffe-8583-42bb-8c98-ec1d413bf7cc',
                         'job-name': 'seg-pred-test'},
              'managed_fields': [{'api_version': 'batch/v1',
                                  'fields_type': 'FieldsV1',
```
alternatively, we would need to give up our desire to "pull back" the logs into the Prefect UI and just rely on our own internal logging mechanism (and then we can set the job to delete its resources after it's finished)… this is what I'm trying right now:
```python
# (presumably inside a `with Flow(...) as flow:` block; create_and_run_job, read_job,
# print_job, list_pods, get_pod_ids, get_our_pod_name and delete_pod are our own tasks)

# del_job = delete_if_exists()
k8s_job = create_and_run_job()
# del_job.set_downstream(k8s_job)

v1job = read_job()
k8s_job.set_downstream(v1job)
print_job_output = print_job(v1job)

# note: indexing like this only resolves at runtime if read_job returns a plain
# dict (e.g. the V1Job converted with .to_dict())
controller_uid = v1job["metadata"]["labels"]["controller-uid"]
# note: the kubernetes client expects label_selector as a "key=value" string,
# not a dict, so the selector has to be built from controller_uid
pods = list_pods(kube_kwargs={"label_selector": "controller-uid=" + controller_uid})
list_of_pods = get_pod_ids(pods)
pod_name = get_our_pod_name(pods)
delete_pod(pod_name)
```
a
Thanks for explaining the problem in more detail. This is something that would need to be investigated further, and perhaps some engineers from the Integrations team can chime in to help (maybe there's some tweak that can be added to those tasks to get the info you need). I'll be OOO for Easter so I can't dive deeper now, but let me open a GitHub issue for now. Feel free to keep adding notes or continue the discussion on the GitHub issue. @Marvin open "How to get the name of the pod related to the job created via RunNamespacedJob?"
🙏 1
✅ 1
t
Thanks @Anna Geller - I'm also going on vacation for Passover 😄
👍 1