The context of this question is that we were having this issue with subflows, so we are trying to find other solutions:
Joao Moniz
06/21/2023, 2:35 PM
Nate
06/21/2023, 5:24 PM
just in general, flow calls from within flows are considered subflows, and `run_deployment` is effectively calling a flow (it just could be on different infrastructure / defined in a different storage location, and it is associated with an existing deployment)
Nate
06/21/2023, 5:25 PM
what issues are you having with subflows?
Joao Moniz
06/22/2023, 5:33 AM
Hi @Nate, thanks for your answer.
The issue we are having is that our flow has hundreds of async tasks that are executed with `asyncio.gather` on EKS. To avoid overloading the infra, we were limiting the number of tasks running concurrently by using Task Run Concurrency tags, but at some point the running tasks just stop and deadlock.
We thought it could be a lack of resources, but even after we increased the memory/CPU, they still deadlocked without being constrained by available resources.
From the docs we've found this piece of info below, so that's why we were asking about subflows.
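For reference, the pattern described above (many coroutines launched through `asyncio.gather` with a cap on how many run at once) can be sketched with the standard library alone. This is not Prefect's tag-based Task Run Concurrency limit and is not the code from this thread. It is a minimal plain-asyncio sketch, and `run_limited`, `fake_task`, and `demo` are hypothetical names introduced only for illustration.

```python
import asyncio


async def run_limited(coros, limit):
    """Await all coroutines, letting at most `limit` run at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(coro):
        # Each coroutine waits here until one of the `limit` slots is free.
        async with semaphore:
            return await coro

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(c) for c in coros))


async def fake_task(i, state):
    # Hypothetical stand-in for a real task; records peak concurrency.
    state["running"] += 1
    state["peak"] = max(state["peak"], state["running"])
    await asyncio.sleep(0.01)
    state["running"] -= 1
    return i * 2


def demo(n=20, limit=5):
    # Runs n fake tasks with the given cap and reports the observed peak.
    state = {"running": 0, "peak": 0}
    coros = [fake_task(i, state) for i in range(n)]
    results = asyncio.run(run_limited(coros, limit))
    return results, state["peak"]
```

Calling `demo(20, 5)` returns the doubled inputs in order along with the highest number of tasks that were in flight at once, which should never exceed the limit. A cap held inside the process like this only bounds work within one flow run. It does not coordinate across runs the way server-side concurrency tags do.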
Joao Moniz
06/22/2023, 10:02 AM
Some other things we've tried that didn't work:
• Deploy only a main flow, without subflows
• Instead of using `asyncio.gather`, we went with Prefect's concurrent method (using `task.submit`)
• Using S3 to persist task results
But somehow they keep getting stuck 😕