# best-practices
j
Question: We are looking for a pipeline orchestrator and I have been testing Prefect. I had a very good experience following the deployment docs, but one thing has me a bit confused. I was testing out the flow versioning feature by sending a TERM signal to the agent while it was running a flow, and it seems this leaves the flow in the Running state indefinitely (or Pending, depending on when I interrupt the agent). I am following https://docs.prefect.io/concepts/work-queues/ for the agent setup, and what I see seems similar to the issue described here: https://github.com/PrefectHQ/prefect/issues/2834, so I went looking for Docker or Kubernetes agents. However, I'm not finding anything for Prefect v2. Am I doing something horribly wrong? What is the best practice for handling hanging flows caused by issues on the agent side?
b
Hello Jonas, here is a good article for deploying Prefect agents on Kubernetes.
In regard to best practices for handling hanging flows, I will need to do a bit of research and get back to you!
j
Thanks! I followed a link in that article and found a hint regarding Docker/Kubernetes here: https://discourse.prefect.io/t/prefect-1-0-prefect-2-0-kubernetes-run-config-to-kubernetesjob-infrastructure-block/1229. Namely, `prefect deployment build` takes a CLI arg `--infra`, which can be set to docker. In this case, I think it is safe for me to assume that issue #2834 that I linked above is an exact match for the behaviour I am seeing, right?
b
Hi Jonas, sorry for the delayed response. Yes, I do believe that the issue you linked is a match to the behavior that you are seeing. If you're inclined to do so, I'd recommend opening a new issue in GitHub or commenting on the original post with a description of what you're encountering.
With regard to best practice for handling hanging flows, I've seen that a proactive approach tends to fare better than a reactive one. Proactive: ensuring that ample computational resources are available in your k8s infrastructure to prevent flows from hanging. Reactive: checking the agent logs / container logs to see what caused the job to be interrupted or to hang. A lot of documentation on this exists for Prefect 1.0, and more will become available for 2.0 going forward (definitely continue to check Discourse for this). Stay tuned!
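To make the reactive side a bit more concrete, here is a minimal sketch of a check that flags flow runs that have sat in Running or Pending for too long. It does not call any Prefect API: it assumes the run records were already fetched by some other means and are plain dicts. The field names (`name`, `state`, `state_since`), the one-hour threshold, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical record shape (not a Prefect type):
#   {"name": str, "state": str, "state_since": timezone-aware datetime}


def find_stale_runs(runs, now, max_age=timedelta(hours=1),
                    states=("RUNNING", "PENDING")):
    """Return the runs that have stayed in one of `states` longer than `max_age`."""
    return [
        run for run in runs
        if run["state"] in states and now - run["state_since"] > max_age
    ]


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the check.
    now = datetime(2022, 9, 1, 12, 0, tzinfo=timezone.utc)
    sample_runs = [
        {"name": "run-a", "state": "RUNNING",
         "state_since": now - timedelta(hours=3)},
        {"name": "run-b", "state": "RUNNING",
         "state_since": now - timedelta(minutes=5)},
        {"name": "run-c", "state": "COMPLETED",
         "state_since": now - timedelta(hours=6)},
    ]
    for run in find_stale_runs(sample_runs, now):
        print(run["name"], run["state"])
```

A run flagged this way is only a candidate: the agent or container logs are still where you would confirm whether the job was actually interrupted.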
j
got it, thanks for getting back!