Jean-Michel Provencher

08/03/2023, 5:10 PM
Out of curiosity, has anybody been able to launch a flow in Kubernetes using a Pod instead of a Job?
@Jeff Hale maybe?

jawnsy

08/03/2023, 5:16 PM
Can you describe your use case a bit more?

Jean-Michel Provencher

08/03/2023, 5:17 PM
The problem we face the most as an organization is that Prefect does not pair very well with Kubernetes autoscaling. When a pod is moved to another node, the flow is marked as crashed, then a pod for the same flow starts on another node and eventually succeeds. When this happens, we have very little visibility into it in the UI, other than the fact that the run count will be 2 for that flow.

jawnsy

08/03/2023, 5:18 PM
you might be able to specify the template as a Pod instead of Job: https://prefecthq.github.io/prefect-kubernetes/worker/ however, I don’t think we’ve tested that, so I wouldn’t rely on that behavior
pods shouldn’t be restarting with our default restartPolicy of Never
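For reference, a minimal sketch of the two manifest shapes being discussed, written as plain Python dicts. The helper names and the image name are hypothetical and this is not Prefect's actual template. The `restartPolicy: Never` and `backoffLimit: 0` fields are standard Kubernetes settings that limit restarts and retries after a failure; as noted in this thread, they may not cover the case where a pod is moved to another node.

```python
import copy


def build_job_manifest(name: str, image: str) -> dict:
    """Build a bare-bones batch/v1 Job manifest (hypothetical helper)."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Ask the Job controller not to retry after a pod failure.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    # Ask the kubelet not to restart failed containers in place.
                    "restartPolicy": "Never",
                    "containers": [{"name": "flow", "image": image}],
                }
            },
        },
    }


def job_to_pod_manifest(job: dict) -> dict:
    """Lift the Job's pod template into a standalone v1 Pod manifest."""
    template = copy.deepcopy(job["spec"]["template"])
    metadata = template.get("metadata", {})
    metadata.setdefault("name", job["metadata"]["name"])
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": metadata,
        "spec": template["spec"],
    }


job = build_job_manifest("my-flow", "example/flow:latest")  # placeholder image
pod = job_to_pod_manifest(job)
print(job["kind"], pod["kind"])
```

A bare Pod built this way has no Job controller behind it, so nothing would recreate it after it is removed from a node.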

Jean-Michel Provencher

08/03/2023, 5:18 PM
yeah exactly, I read the code and there would be edge cases
however it's still restarting because kubernetes does not consider moving a pod as a failure
I validated in their code... that's the problem 😕

jawnsy

08/03/2023, 5:31 PM
Well, this definitely seems like a useful feature for us to consider! Would you mind searching our repo to see if there’s an issue, and creating one if not? Either prefect or prefect-kubernetes seems like a good place

Jean-Michel Provencher

08/03/2023, 7:22 PM
here's an example.
A few things seem to be tied to the Job concept
I can try on my side to see if that works
Submission failed. ValidationError: 1 validation error for KubernetesJob job Job has incompatble values for the following attributes: /kind must have value 'Job' (type=value_error)
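The error above suggests the submitted manifest is checked against a set of required values before it is accepted. A minimal sketch of that kind of check follows. This is an assumption about the general shape of the validation, not Prefect's actual code, and `incompatible_attributes` is a hypothetical name.

```python
from typing import List


def incompatible_attributes(manifest: dict, required: dict) -> List[str]:
    """Return one message per top-level key whose value differs from the required one."""
    problems = []
    for key, expected in required.items():
        if manifest.get(key) != expected:
            problems.append(f"/{key} must have value {expected!r}")
    return problems


# A Pod manifest checked against a rule that only accepts Jobs.
pod_manifest = {"apiVersion": "v1", "kind": "Pod"}
print(incompatible_attributes(pod_manifest, {"kind": "Job"}))
```

With a check like this in place, switching `kind` to `Pod` in the template is rejected before anything reaches the cluster, which fits the point that a few things are tied to the Job concept.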

jawnsy

08/03/2023, 7:39 PM
I think it needs code changes, but I think those code changes make sense for us to evaluate… I’m not sure when we can prioritize it though unless a lot of people are facing similar issues
we’d be happy to review pull requests though, and some of the team may be able to give you some direction on how to implement

Jean-Michel Provencher

08/03/2023, 7:40 PM
We are a paying customer so I'm sure this can weigh in the balance 😄
I think offering both is a good idea. The advantage of pods would be that you only launch one pod and get rid of any internal Kubernetes retry mechanism and things like that

jawnsy

08/03/2023, 8:09 PM
yup, but pods can confuse autoscalers; I think I’ve seen some controllers avoid scaling down clusters because pods didn’t have a controller associated with them
in theory we could turn the Prefect Kubernetes worker into a Kubernetes controller, which would then own pods that it creates. but we currently write the worker in Python and the Kubernetes Python SDK is not very complete (relative to the Go SDK)
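To illustrate what "owning" a pod means at the manifest level: Kubernetes records ownership in `metadata.ownerReferences`, and the entry with `controller: true` names the managing controller. A minimal sketch, assuming a hypothetical custom resource as the owner (the `FlowRun` kind, its API group, and the helper name are made up for illustration):

```python
import copy


def with_controller_owner(pod: dict, owner: dict) -> dict:
    """Return a copy of the pod manifest that lists `owner` as its controller."""
    owned = copy.deepcopy(pod)
    refs = owned.setdefault("metadata", {}).setdefault("ownerReferences", [])
    refs.append(
        {
            "apiVersion": owner["apiVersion"],
            "kind": owner["kind"],
            "name": owner["metadata"]["name"],
            "uid": owner["metadata"]["uid"],  # must be the owner's real UID in the cluster
            "controller": True,
            "blockOwnerDeletion": True,
        }
    )
    return owned


# Hypothetical owner object; no such resource exists in Prefect today.
owner = {
    "apiVersion": "example.com/v1",
    "kind": "FlowRun",
    "metadata": {"name": "run-1", "uid": "00000000-0000-0000-0000-000000000000"},
}
pod = {"apiVersion": "v1", "kind": "Pod", "metadata": {"name": "flow-pod"}, "spec": {}}
print(with_controller_owner(pod, owner)["metadata"]["ownerReferences"])
```

Whether a given autoscaler treats a pod with a custom owner as controller-backed is something to verify against that autoscaler's documentation.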

Jean-Michel Provencher

08/04/2023, 4:55 PM
ok thanks for the info!