# prefect-community
h
hi! i've spent hours and hours debugging my configuration with AWS ECS Fargate encountering `ClusterNotFound` but haven't gotten anywhere. i'd immensely appreciate it if anyone would have any insights! details in thread
i'm running an agent on an ECS cluster named `etl` with the config below
{
  "family": "prefect-agent",
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "networkMode": "awsvpc",
  "cpu": "512",
  "memory": "1024",
  "taskRoleArn": "arn:aws:iam::685847091493:role/PrefectAgent",
  "executionRoleArn": "arn:aws:iam::685847091493:role/PrefectAgent",
  "containerDefinitions": [
    {
      "name": "prefect-agent",
      "image": "prefecthq/prefect:1.2.2-python3.8",
      "essential": true,
      "command": [
        "prefect",
        "agent",
        "ecs",
        "start"
      ],
      "environment": [
        {
          "name": "PREFECT__CLOUD__API_KEY",
          "value": "<REDACTED>"
        },
        {
          "name": "PREFECT__CLOUD__AGENT__LABELS",
          "value": "['etl']"
        },
        {
          "name": "PREFECT__CLOUD__AGENT__LEVEL",
          "value": "DEBUG"
        },
        {
          "name": "PREFECT__CLOUD__API",
          "value": "https://api.prefect.io"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/prefect-agent",
          "awslogs-region": "ap-southeast-1",
          "awslogs-stream-prefix": "ecs",
          "awslogs-create-group": "true"
        }
      }
    }
  ]
}
my prefect ECS agent service looks like this
aws ecs create-service \
    --service-name prefect-agent \
    --task-definition prefect-agent:1 \
    --desired-count 1 \
    --launch-type FARGATE \
    --platform-version LATEST \
    --cluster "arn:aws:ecs:<MY_PROJECT>:cluster/etl" \
    --network-configuration "awsvpcConfiguration={subnets=[subnet-abcd],securityGroups=[sg-abcdef],assignPublicIp=ENABLED}"
i've verified that my agent is running and can connect to the prefect cloud
my flow looks extremely simple
import prefect
from prefect.run_configs import ECSRun


@prefect.task
def do_nothing():
    prefect.context.get("logger").info("Testing")


with prefect.Flow("test-flow") as flow:
    do_nothing()

flow.run_config = ECSRun(
    cpu="0.25 vcpu",
    memory=128,
    image="prefecthq/prefect:1.2.2-python3.10",
)
flow.register("TEST", labels=["etl"], set_schedule_active=False)
but whenever i try to launch a flow run from the prefect cloud UI, i keep seeing
(ClusterNotFoundException) when calling the RunTask operation: Cluster not found.
so right before the call of `RunTask` is made by `python3.10/site-packages/botocore/client.py(462)` (i'm collecting the following info by running a local ECS prefect agent on python3.10), i intercepted and printed out the `kwargs` that are sent to the AWS API, as below
{
    'taskDefinition': 'arn:aws:ecs:<MY_REGION>:<MY_PROJECT>:task-definition/prefect-test-flow-626d9082-a97c-44e5-ba92-1fc0fea52153:1', 
    'networkConfiguration': {
        'awsvpcConfiguration': {
            'subnets': ['subnet-0109eb9c40d7abca8', 'subnet-06e530acd78c534d5', 'subnet-0d2f61393fee4e563'], 
            'assignPublicIp': 'ENABLED'
        }
    }, 
    'launchType': 'FARGATE', 
    'overrides': {
        'containerOverrides': [{
            'name': 'flow', 'command': ['/bin/sh', '-c', 'prefect execute flow-run'], 
            'environment': [
                {'name': 'PREFECT__LOGGING__LEVEL', 'value': 'INFO'}, 
                {'name': 'PREFECT__CLOUD__USE_LOCAL_SECRETS', 'value': 'false'}, 
                {'name': 'PREFECT__ENGINE__FLOW_RUNNER__DEFAULT_CLASS', 'value': 'prefect.engine.cloud.CloudFlowRunner'}, 
                {'name': 'PREFECT__ENGINE__TASK_RUNNER__DEFAULT_CLASS', 'value': 'prefect.engine.cloud.CloudTaskRunner'}, 
                {'name': 'PREFECT__BACKEND', 'value': 'cloud'}, 
                {'name': 'PREFECT__CLOUD__API', 'value': 'https://api.prefect.io'}, 
                {'name': 'PREFECT__CONTEXT__FLOW_RUN_ID', 'value': '626d9082-a97c-44e5-ba92-1fc0fea52153'}, 
                {'name': 'PREFECT__CONTEXT__FLOW_ID', 'value': '56d0b1a5-e7a3-4a2b-bb58-8a0a24632cde'}, 
                {'name': 'PREFECT__CLOUD__SEND_FLOW_RUN_LOGS', 'value': 'true'}, 
                {'name': 'PREFECT__CLOUD__API_KEY', 'value': '<REDACTED>'}, 
                {'name': 'PREFECT__CLOUD__TENANT_ID', 'value': '4079fb25-7df9-4f21-b43e-e1ccf5d8d9fc'}, 
                {'name': 'PREFECT__CLOUD__AGENT__LABELS', 'value': "['etl']"}, 
                {'name': 'PREFECT__LOGGING__LOG_TO_CLOUD', 'value': 'true'}, 
                {'name': 'PREFECT__CLOUD__AUTH_TOKEN', 'value': '<REDACTED>'}
            ]}], 
        'taskRoleArn': 'arn:aws:iam::<MY_PROJECT>:role/PrefectRunner', 
        'executionRoleArn': 'arn:aws:iam::<MY_PROJECT>:role/PrefectRunner', 
        'cpu': '0.25 vcpu', 
        'memory': '128'}
    }
the `PrefectRunner` and `PrefectAgent` roles right now have the exact same policies (rather redundant i must say), so i think it's not a permissions issue
a
e.g. you may explicitly define your cluster name
flow.run_config = ECSRun(
    labels=["prod"],
    task_definition_path="s3://bucket/flow_task_definition.yaml",
    run_task_kwargs=dict(cluster="prefectEcsCluster"),
)
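For context on why naming the cluster matters: the intercepted `RunTask` kwargs earlier in the thread contain no `cluster` key, and ECS assumes the cluster named `default` when none is given, which would explain `ClusterNotFoundException` if only `etl` exists in the account. A minimal sketch of that check follows; `missing_cluster` is a hypothetical helper (not part of Prefect or boto3), and the dict is a trimmed copy of the payload above.

```python
def missing_cluster(run_task_kwargs: dict) -> bool:
    """Return True when the RunTask kwargs do not name a cluster."""
    return not run_task_kwargs.get("cluster")


# Trimmed copy of the intercepted payload: note there is no "cluster" key.
intercepted = {
    "launchType": "FARGATE",
    "overrides": {"cpu": "0.25 vcpu", "memory": "128"},
}

# Same payload with the cluster named explicitly (the asker's "etl" cluster).
with_cluster = {**intercepted, "cluster": "etl"}

print(missing_cluster(intercepted))   # True
print(missing_cluster(with_cluster))  # False
```

Passing `run_task_kwargs=dict(cluster=...)` on `ECSRun`, as in the snippet above, is one way to get that key into the request.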
h
unfortunately that would yield
An error occurred (InvalidParameterException) when calling the RunTask operation: No Fargate configuration exists for given values.
right now i'm letting the agent auto-create the task definition - might that be a source of the issue?
it's also highly probable that my ECS configuration is really wrong somehow - so i was also hoping that someone with a lot of experience like you could help me spot if something seems odd
okay... i think by removing `cpu` and `memory` it starts to work...
alright this is resolved. sorry for the noise! hope it'll be helpful to some people seeing the same issues too. setting resources too low can sometimes make fargate angry!
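A likely reason removing `cpu` and `memory` helped: Fargate only accepts specific CPU/memory pairs, and 0.25 vCPU (256 CPU units) needs at least 512 MiB, so `memory=128` matches no Fargate configuration. Below is a rough sketch of a pre-flight check. `FARGATE_SIZES` and `is_valid_fargate_size` are hypothetical names, and the table is an assumption transcribed from AWS's published sizes for the smaller tiers only, so check the current AWS docs before relying on it.

```python
# CPU units -> allowed memory values in MiB, for the smaller Fargate sizes.
# Assumption: transcribed from AWS's published table; larger sizes are omitted.
FARGATE_SIZES = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu_units: int, memory_mib: int) -> bool:
    """Return True if the CPU/memory pair appears in the table above."""
    return memory_mib in FARGATE_SIZES.get(cpu_units, [])


print(is_valid_fargate_size(256, 128))  # False: the pair used in this thread
print(is_valid_fargate_size(256, 512))  # True: smallest pair in the table
```

With a check like this, the flow from this thread would presumably have been flagged before registration, and `cpu=256, memory=512` would be the smallest explicit setting to try.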
a
Nice work and thanks for the update!