Michael Gardner

09/12/2023, 9:37 PM
I am using the ecs:push work pool. When I submit a concurrent batch of 100 flows to be run, only around 80 tasks are sent to my AWS cluster at once. It then stalls and sporadically sends a few more tasks with significant delays in between (15+ min). Where can I look, either in Prefect or in AWS, to see what the issue is? The AWS cluster has sufficient quota to run up to 4000 vCPUs.

Jake Kaplan

09/12/2023, 9:40 PM
Hey! Can you send me your workspace ID? It is definitely unexpected for there to be a delay.

Michael Gardner

09/12/2023, 9:43 PM
ea44d884-2e28-4ecd-a6cb-09b59418161f
Let me know if that isn't it

Jake Kaplan

09/12/2023, 10:01 PM
Thanks so much! Additionally, if you could provide a flow run ID for a run where you're seeing the roughly 15-minute delay, that would be great, as well as any info on how you're submitting your concurrent batch of 100. Will investigate and get back to you!

Michael Gardner

09/12/2023, 10:07 PM
This flow run never ran and is sitting in Pending: c9909a38-4683-486e-9214-02cbbd6f98cc
This flow run appears to have run successfully but still shows as Pending in the UI: af8bb299-fba5-4dfd-9d8a-0dc5d54c3154
It appears 16 of the 100 flows I sent as a test are stuck and aren't being sent to my cluster for whatever reason.

Jake Kaplan

09/12/2023, 10:15 PM
Thanks! That's very helpful. And how are you submitting them? With a script calling run_deployment, or something similar?

Michael Gardner

09/12/2023, 10:19 PM
Yes,
from prefect.deployments.deployments import run_deployment

run_deployment(
    name="Table-Of-Contents-Flow/Table-of-Contents-Deployment",
    timeout=0,
)
I am using timeout=0 based on the answer in this thread: https://discourse.prefect.io/t/how-can-i-run-multiple-subflows-or-child-flows-in-parallel/96/12
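
For reference, a minimal sketch of how a batch like this might be submitted in a loop while keeping the returned flow run IDs, so that stuck runs can be looked up later. The helper names `submit_batch` and `submit_with_prefect` are hypothetical and not part of Prefect. The sketch assumes that `run_deployment` returns a flow run object with an `id` attribute and is called from a plain script outside a flow.

```python
from typing import Any, Callable, List


def submit_batch(submit: Callable[..., Any], deployment_name: str, count: int) -> List[Any]:
    """Call `submit` `count` times and collect whatever each call returns.

    `submit` is expected to accept `name` and `timeout` keyword arguments,
    as run_deployment does in the snippet above.
    """
    runs = []
    for _ in range(count):
        # timeout=0 is passed so that each call returns without waiting
        # for the flow run to finish (same setting as the snippet above).
        runs.append(submit(name=deployment_name, timeout=0))
    return runs


def submit_with_prefect(count: int = 100) -> List[str]:
    """Hypothetical wrapper: submit `count` runs and return their IDs as strings."""
    # Imported lazily so submit_batch can be exercised without Prefect installed.
    from prefect.deployments.deployments import run_deployment

    runs = submit_batch(
        run_deployment,
        "Table-Of-Contents-Flow/Table-of-Contents-Deployment",
        count,
    )
    # Assumption: each returned flow run exposes an `id` attribute.
    return [str(run.id) for run in runs]
```

Calling `submit_with_prefect(100)` and saving the returned IDs would make it easier to compare what was submitted against what actually reached the cluster.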