# ask-community
r
Hi Prefect team, we recently reconfigured our deployments to run multiple times from a master flow that uses an ETL template. We use AWS infrastructure and found that this reconfiguration works fine on Fargate instances but mysteriously crashes on EC2 instances. These flows previously worked fine on EC2 (and Fargate) when instantiated as one-deployment-per-flow.
21:53:15.511 | INFO    | prefect.agent - Submitting flow run '7bec67b1-ddba-429a-a3a6-dc071401433b'
21:53:17.265 | INFO    | prefect.infrastructure.ecs-task - ECSTask 'pretty-mussel': Retrieving task definition 'arn:aws:ecs:us-east-1:421396523132:task-definition/zarr-etl-ec2-efs:44'...
21:53:17.517 | INFO    | prefect.infrastructure.ecs-task - ECSTask 'pretty-mussel': Retrieving task definition 'zarr-etl-ec2-efs'...
21:53:17.540 | WARNING | prefect.infrastructure.ecs-task - ECSTask 'pretty-mussel': Settings require changes to the linked task definition. A new task definition will be registered.
21:53:17.603 | INFO    | prefect.infrastructure.ecs-task - ECSTask 'pretty-mussel': Creating task run...
21:53:17.726 | ERROR   | prefect.agent - Failed to submit flow run '7bec67b1-ddba-429a-a3a6-dc071401433b' to infrastructure.
Traceback (most recent call last):
  File "/root/.local/lib/python3.10/site-packages/prefect/agent.py", line 490, in _submit_run_and_capture_errors
    result = await infrastructure.run(task_status=task_status)
  File "/root/.local/lib/python3.10/site-packages/prefect_aws/ecs.py", line 621, in run
    ) = await run_sync_in_worker_thread(
  File "/root/.local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(
  File "/root/.local/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/root/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/root/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/root/.local/lib/python3.10/site-packages/prefect_aws/ecs.py", line 814, in _create_task_and_wait_for_start
    self._report_task_run_creation_failure(task_run, exc)
  File "/root/.local/lib/python3.10/site-packages/prefect_aws/ecs.py", line 810, in _create_task_and_wait_for_start
    task = self._run_task(ecs_client, task_run)
  File "/root/.local/lib/python3.10/site-packages/prefect_aws/ecs.py", line 1487, in _run_task
    return ecs_client.run_task(**task_run)["tasks"][0]
IndexError: list index out of range
We would like to avoid a one-deployment-per-flow setup if at all possible, as it's considerably clunkier both code-wise and in the Cloud interface. However, we must run some of our ETLs on EC2. We encountered this error on Prefect 2.10.10 with flow runs using ECSTasks. Any ideas? Note that this appears to be a repeat of a previous, unanswered issue in this channel: https://linen.prefect.io/t/10301091/hi-i-am-running-a-deployment-multiple-times-from-a-master-fl. @Tibs did you ever find a solution?
z
It looks like when we sent the run_task request to ECS, it did not return a created task.
Is this consistent or intermittent?
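For anyone debugging the same IndexError: when ECS cannot place a task, the run_task response comes back with an empty "tasks" list and the explanation in a separate "failures" list. Below is a minimal sketch of checking that list before indexing. The helper name first_task_or_raise is hypothetical and is not part of prefect_aws; it only assumes the response is the dict returned by boto3's run_task.

```python
def first_task_or_raise(response: dict) -> dict:
    """Return the first created task from an ECS run_task response.

    Hypothetical helper (not part of prefect_aws). Instead of letting
    response["tasks"][0] raise a bare IndexError, surface the reasons
    ECS reported in the response's "failures" list.
    """
    tasks = response.get("tasks", [])
    if tasks:
        return tasks[0]

    failures = response.get("failures", [])
    details = "; ".join(
        f"{f.get('arn', 'unknown')}: {f.get('reason', 'no reason given')}"
        for f in failures
    ) or "no failures reported"
    raise RuntimeError(f"ECS run_task created no tasks ({details})")
```

On the EC2 launch type, a reason such as RESOURCE:MEMORY or RESOURCE:CPU in that list would point at the container instances rather than at the flow code.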
r
@Zanie we’re exploring the possibility that AWS’s EC2 Autoscaling Group is not, in fact, scaling and therefore there are no EC2 tasks available. Will report back.
t
@Robert Banick -- did you ever discover the root cause here? We're running into a similar issue.
r
@Ted Collins it was a problem with the EC2 Autoscaling Group not having enough available instances. I haven’t had time to sit down with our DevOps guy and write up more formally what the issue was and how it was resolved, but in short we bumped up the instance numbers and the problem went away.
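In case it helps others confirm the same root cause, here is a rough sketch of listing the active container instances in a cluster and the CPU and memory they have left, using boto3's list_container_instances and describe_container_instances. The function name and the cluster name in the usage note are placeholders, not taken from our setup, and the sketch does not paginate, so it assumes the cluster has no more than 100 instances.

```python
def summarize_capacity(ecs_client, cluster: str) -> list[dict]:
    """Summarize remaining CPU/memory per ACTIVE container instance.

    Hypothetical helper: ecs_client is expected to be boto3.client("ecs").
    No pagination is done, so this assumes at most 100 instances.
    """
    arns = ecs_client.list_container_instances(
        cluster=cluster, status="ACTIVE"
    )["containerInstanceArns"]
    if not arns:
        # Nothing registered: the Auto Scaling Group has not provided
        # any instances for ECS to place tasks on.
        return []

    described = ecs_client.describe_container_instances(
        cluster=cluster, containerInstances=arns
    )
    summary = []
    for inst in described["containerInstances"]:
        # remainingResources is a list of {"name": ..., "integerValue": ...}
        remaining = {
            r["name"]: r.get("integerValue")
            for r in inst.get("remainingResources", [])
            if r["name"] in ("CPU", "MEMORY")
        }
        summary.append(
            {
                "arn": inst["containerInstanceArn"],
                "running": inst.get("runningTasksCount", 0),
                "pending": inst.get("pendingTasksCount", 0),
                "remaining_cpu": remaining.get("CPU"),
                "remaining_memory": remaining.get("MEMORY"),
            }
        )
    return summary
```

Usage would look like summarize_capacity(boto3.client("ecs"), "my-etl-cluster"). An empty list, or instances with too little remaining CPU or memory for the task definition, would be consistent with what we saw.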
t
Thanks for the info @Robert Banick