Jack
07/30/2024, 9:14 PM
I'm using `run_deployment` to run sub-flows on their own pod/node when the parent flow is deployed on my Kubernetes work pool (in EKS). However, when the cluster scales up to schedule these new pods/nodes, Prefect thinks the sub-flows crashed (even though they eventually start and complete successfully), causing the parent flow to move on when it should stay blocked.
I'm just looking for a solid way to schedule sub-flows on their own pods/nodes (they require heavy resources).
Is there a preferred way to do this? I've scoured the docs/discourse/FAQ to no avail...
Here's a torn-down version of a parent flow I'm working on:
import asyncio

from prefect import flow
from prefect.deployments import run_deployment


@flow(log_prints=True)
async def my_flow(config_path: str, local: bool = False):
    things = [ . . . ]  # torn down: the real list of work items
    if local:
        # local invocation -> process things sequentially
        for thing in things:
            await my_subflow(config_path, thing.id)
    else:
        # deployed invocation -> each subflow gets its own pod
        await asyncio.gather(
            *[
                run_deployment(
                    name="my-deployed-subflow",
                    parameters={"config_path": config_path, "id": thing.id},
                )
                for thing in things
            ]
        )
Nate
07/30/2024, 9:20 PM
> However, I'm noticing when the cluster scales up to support the scheduling of these new pods/nodes, prefect thinks these sub-flows crashed (when eventually they start and complete successfully) causing the parent flow to move on when it should be blocked.

It sounds like something is amiss here, whether due to the Kubernetes worker implementation or resource allocation in your cluster somehow. Do you have any more info you can offer about what you're seeing?
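One concrete thing worth checking along these lines: if the Crashed states coincide with autoscaler node provisioning, the Kubernetes worker's pod watch timeout may be expiring while pods sit in Pending. A hedged sketch of raising it via the deployment's job variables; the variable name assumes the default Kubernetes work-pool base job template and the pool name is hypothetical, so verify what your pool actually exposes with `prefect work-pool inspect`:

```yaml
# prefect.yaml (fragment) -- illustrative only; key names assume the default
# Kubernetes work-pool base job template.
deployments:
  - name: my-deployed-subflow
    work_pool:
      name: my-k8s-pool                  # hypothetical pool name
      job_variables:
        pod_watch_timeout_seconds: 600   # tolerate slow node scale-up
```

If the worker gives up watching a Pending pod before the new node is ready, the run can be reported as Crashed even though the pod later starts, which matches the symptom described above.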
Jack
07/30/2024, 9:31 PM
Jack
07/30/2024, 9:33 PM
Jack
07/30/2024, 10:09 PM
Nate
07/30/2024, 10:17 PM
Jack
07/30/2024, 10:34 PM