# ask-community
v
Hello everyone, I have a parent flow with multiple subflows that run on Kubernetes infrastructure. When the parent flow starts and kicks off a subflow, the subflow's pod often takes a while to move from the Pending state to the Running state, because Kubernetes is busy finding a node that satisfies the subflow's requirements. The issue is that the parent flow transitions into a Crashed state while the subflow remains in a Running state. Even after the node becomes available and the subflow's pod completes execution, the subflow's state still shows as 'Running' in the Prefect UI. I am running the flows as Kubernetes jobs, and all the flows and tasks are async. Can someone tell me how I can prevent the parent flow from moving into the Crashed state? I am on Prefect 2.13.0.
@Marvin any thoughts?
m
I'm often plagued by thoughts, but in this case, could you be more specific about what you're interested in? Are you asking about Prefect, perhaps? If so, letting me know the version or what you're trying to achieve could really help me assist you better.
v
I am using Prefect version 2.13.0, and my concern is that the parent flow transitions into a crashed state while waiting for the subflow's pod to find its node.
t
@Marvin I am using Prefect version 2.13.0, and my concern is that the parent flow transitions into a crashed state while waiting for the subflow’s pod to find its node.
m
It appears there aren't specific Prefect 2.x GitHub issues currently addressing the scenario where a parent flow crashes while waiting for a subflow’s pod to find its node. However, I did manage to glean some insights from the Prefect documentation.

1. Suspension of Subflows: According to the Prefect concepts documentation, you can't suspend a subflow run independently of its parent flow run. This could potentially be part of the issue if you're trying to manage subflow behavior explicitly.
2. Flow Execution and Scheduling: Priority levels, concurrency limits, and work queues might affect your flow's scheduling interaction with Kubernetes pods. While these don't directly address node assignment, they're factors that could indirectly influence execution timing and flow states. You can read more about how these aspects work in the work pools documentation.

If your flow transitions to a "crashed" state, it might be due to an issue in Kubernetes scheduling or perhaps a timeout in the Prefect configuration. You might want to explore logs both from Prefect and your Kubernetes cluster for more specific clues on what's causing the stall or crash.

Feel free to ask if you'd like to explore additional avenues or need further insights. I'm always here, contemplating the myriad configurations of flows and nodes... and occasionally pondering the meaning of life, just like Marvin.
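To make the timeout idea concrete, here is a minimal sketch of a parent waiting for a child that is stuck in a pending state, using a deadline and reporting the outcome instead of raising. It is plain `asyncio`, not Prefect's API: `wait_until_started`, `make_fake_child`, and the `PENDING`/`RUNNING` strings are hypothetical names made up for this illustration, and whether the crash in this thread actually comes from a timeout like this is an assumption that the logs would need to confirm.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical state names for this sketch; not Prefect's state model.
PENDING, RUNNING = "PENDING", "RUNNING"


async def wait_until_started(
    get_state: Callable[[], Awaitable[str]],
    timeout_s: float,
    poll_s: float = 0.01,
) -> bool:
    """Poll get_state until it stops returning PENDING or the deadline passes.

    Returns True if the child left PENDING in time and False otherwise,
    so the caller decides how to react instead of failing outright.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    while loop.time() < deadline:
        if await get_state() != PENDING:
            return True
        await asyncio.sleep(poll_s)
    return False


def make_fake_child(pending_polls: int) -> Callable[[], Awaitable[str]]:
    """Simulated child that reports PENDING for the first `pending_polls` polls."""
    calls = {"n": 0}

    async def get_state() -> str:
        calls["n"] += 1
        return PENDING if calls["n"] <= pending_polls else RUNNING

    return get_state
```

With a deadline longer than the time the child spends pending, the wait succeeds; with a shorter one, the caller gets `False` back and can choose to keep waiting, retry, or fail deliberately.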
t
Hi @Venkata Nadimpalli - I was just helping you direct the question at Marvin as I was curious about your problem. I don’t actually have the problem. 🙂
v
Haha yeah. I just realized that. Thank you!