Myles Steinhauser
03/28/2022, 8:36 PM
Flows? Specifically, I’m trying to work around some delayed scaling issues with ECS using EC2 instances (not ECS with Fargate tasks).
Often, this failure is reported back to Prefect as the following error until Capacity Provider scaling has caught up again:
FAIL signal raised: FAIL('a4f09101-0577-41ce-b8b0-31b84f26d855 finished in state <Failed: "Failed to start task for flow run a4f09101-0577-41ce-b8b0-31b84f26d855. Failures: [{\'arn\': \'arn:aws:ecs:us-east-1:<redacted>:container-instance/a8bc98b7c6864874bc6d1138f758e8ea\', \'reason\': \'RESOURCE:CPU\'}]">')
I’m using the following calls to launch the sub-flows (as part of a larger script):
flow_a = create_flow_run(flow_name="A", project_name="myles")
wait_for_flow_a = wait_for_flow_run(flow_a, raise_final_state=True, stream_logs=True)
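[Editor’s note: a minimal, hypothetical sketch of what `wait_for_flow_run(..., raise_final_state=True)` does conceptually — poll the child flow run’s state until it reaches a terminal state, then raise if that state is Failed. The names `wait_for_terminal_state`, `get_state`, and `FlowRunFailed` are illustrative stand-ins, not the Prefect API.]

```python
import time

TERMINAL_STATES = {"Success", "Failed", "Cancelled"}

class FlowRunFailed(Exception):
    """Raised when the watched flow run ends in a Failed state."""

def wait_for_terminal_state(get_state, poll_interval=0.0):
    """Poll get_state() until the flow run reaches a terminal state.

    get_state is a stand-in for querying the Prefect backend; it returns
    the current state name of the flow run as a string. On "Failed" we
    raise, mirroring raise_final_state=True.
    """
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            if state == "Failed":
                raise FlowRunFailed("flow run finished in state Failed")
            return state
        time.sleep(poll_interval)  # back off before polling again
```

This is why the RESOURCE:CPU failure above surfaces in the parent flow: the child run finishes Failed before the Capacity Provider has scaled out, and the waiter re-raises it.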
Anna Geller
03/28/2022, 8:39 PM
Myles Steinhauser
03/28/2022, 8:39 PM
Anna Geller
03/28/2022, 8:44 PM
Myles Steinhauser
03/28/2022, 8:46 PM
Anna Geller
03/28/2022, 8:52 PM
from prefect import Flow
from prefect.tasks.prefect import StartFlowRun
from datetime import timedelta
start_flow_run = StartFlowRun(project_name="PROJECT_NAME", wait=True, max_retries=10, retry_delay=timedelta(minutes=5))
with Flow("FLOW_NAME") as flow:
    staging = start_flow_run(flow_name="child_flow_name")
and the retry_delay timedelta would respect the time set on your scaling policy (e.g. if the scale-out takes 3-4 minutes, then a retry delay of 5 minutes can make sense)
does it make sense?
Myles Steinhauser
03/28/2022, 9:00 PM
StartFlowRun (Task-based) to create_flow_run
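[Editor’s note: a hypothetical, stdlib-only sketch of the retry behavior Anna configures with `StartFlowRun(max_retries=10, retry_delay=timedelta(minutes=5))` — each failed launch attempt waits out the retry delay, giving the ECS Capacity Provider time to scale out before the next try. `launch_with_retries` and `launch` are illustrative names, not Prefect internals.]

```python
import time
from datetime import timedelta

def launch_with_retries(launch, max_retries=10,
                        retry_delay=timedelta(minutes=5),
                        sleep=time.sleep):
    """Call launch() until it succeeds or retries are exhausted.

    launch stands in for starting the child flow run; it should raise on
    a capacity error (e.g. RESOURCE:CPU) and return on success. sleep is
    injectable so the delay can be skipped in tests.
    """
    for attempt in range(max_retries + 1):
        try:
            return launch()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the final failure
            sleep(retry_delay.total_seconds())  # wait for scale-out
```

With 10 retries at 5 minutes each, a 3-4 minute scale-out comfortably completes before the attempts are exhausted.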
Anna Geller
03/30/2022, 10:21 AM
Myles Steinhauser
03/30/2022, 12:46 PM