
Lana Dann

11/23/2021, 7:45 PM
Hi there! I’m attempting to set up an ECS Agent as an ECS Fargate service but I need to scope the IAM permissions to the minimum requirements. I was wondering if someone could explain why we need these permissions so that I can figure out how to scope them out:
"ec2:AuthorizeSecurityGroupIngress",
"ec2:CreateSecurityGroup",
"ec2:CreateTags",
"ec2:DescribeNetworkInterfaces",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeVpcs",
"ec2:DeleteSecurityGroup"

Anna Geller

11/23/2021, 8:08 PM
@Lana Dann you would need two separate IAM roles: one for the ECS execution role and one for the task role. This documentation covers both: https://docs.prefect.io/orchestration/agents/ecs.html#execution-role-arn
This blog post also walks through how to set it all up: https://towardsdatascience.com/how-to-cut-your-aws-ecs-costs-with-fargate-spot-and-prefect-1a1ba5d2e2df
As for why specific permissions are needed:
• The Prefect agent needs permissions to register a task definition, run an ECS task, send logs to CloudWatch, and deregister the task definition. Doing all that requires some additional permissions, e.g. EC2 describe VPCs to look up the VPC and subnet IDs required by the ECS run task API.
• If your flows use ECR for Docker images, or S3 for storage (and possibly results), then you need permissions for those services too. Both the docs and the blog post provide more information on that.
If you’re really interested, you could also have a look at the ECSAgent source code to understand exactly what it is doing and why specific permissions are needed. We have also recently updated the ECSRun run configuration docs, which include many examples that hopefully explain the process and the various possible configurations: https://docs.prefect.io/api/latest/run_configs.html#ecsrun
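To make the "describe VPCs to look up your VPC and subnet IDs" point concrete, here is a minimal sketch of that kind of lookup. It is not the agent's actual code: the function name is hypothetical, and it assumes the default VPC is used and that tasks get a public IP. It only needs ec2:DescribeVpcs and ec2:DescribeSubnets.

```python
from typing import Any, Dict


def default_network_configuration(ec2_client: Any) -> Dict[str, Any]:
    """Build a networkConfiguration for ECS run_task from the default VPC.

    `ec2_client` is expected to behave like boto3.client("ec2").
    Assumption: the account has a default VPC; otherwise an error is raised.
    """
    # Requires ec2:DescribeVpcs
    vpcs = ec2_client.describe_vpcs(
        Filters=[{"Name": "isDefault", "Values": ["true"]}]
    )["Vpcs"]
    if not vpcs:
        raise ValueError("No default VPC found; pass networkConfiguration explicitly")
    vpc_id = vpcs[0]["VpcId"]

    # Requires ec2:DescribeSubnets
    subnets = ec2_client.describe_subnets(
        Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
    )["Subnets"]

    return {
        "awsvpcConfiguration": {
            "subnets": [subnet["SubnetId"] for subnet in subnets],
            "assignPublicIp": "ENABLED",  # assumption: tasks need outbound internet access
        }
    }
```

With boto3 installed and credentials configured, calling default_network_configuration(boto3.client("ec2")) would return a dictionary in the shape that the ECS run task API accepts as networkConfiguration.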

Lana Dann

11/23/2021, 8:11 PM
thanks! quick followup q. we’d like to use datadog and not cloudwatch for our logging and we’ve set up our container definitions to do so. in that case, can i remove all logs:* permissions from the task policy? (also to clarify i don’t have any issues using the provided execution role policy, i just need to scope down the task role policy)
also i’ve been following your blog post and wanted to say thank you for writing it!!
also, i understand the intuition behind needing to describe vpcs, subnets, etc. but in what situation would the agent need to create or delete a security group?

Anna Geller

11/23/2021, 8:17 PM
Regarding sending ECS logs to Datadog, I think you could reach out to AWS and ask about it. This should be possible: you can configure the ECS log driver to be Splunk, so there should be an option to use Datadog instead: https://aws.amazon.com/premiumsupport/knowledge-center/ecs-task-fargate-splunk-log-driver/ Your execution IAM role may then look different depending on how you configure your log driver, e.g. you may need permission to retrieve a secret to authenticate with the external logging service.
You’re right, you can remove “ec2:DeleteSecurityGroup” as long as you don’t need a Dask cluster set up on ECS. Some of the permissions in the docs are not stripped down to the bare minimum, you’re right about that. For example, have a look at the IAM role required by DaskCloudProvider: it needs to delete a security group if it creates one as part of a Dask cluster deployment: https://cloudprovider.dask.org/en/latest/_modules/dask_cloudprovider/aws/ecs.html#FargateCluster If you want strict least privilege, you can start from the task role policy given as the agent role policy in the docs and remove permissions one by one to see whether your flows are affected.
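As a rough sketch of that "remove and test" approach, the snippet below builds a policy document from the EC2 actions quoted at the top of this thread with the security group write actions dropped. Only ec2:DeleteSecurityGroup is confirmed above as removable; dropping ec2:CreateSecurityGroup and ec2:AuthorizeSecurityGroupIngress as well is an assumption you would need to verify against your own flow runs. The helper name is hypothetical.

```python
import json
from typing import Any, Dict, Iterable, List

# EC2 actions quoted at the top of this thread.
AGENT_EC2_ACTIONS: List[str] = [
    "ec2:AuthorizeSecurityGroupIngress",
    "ec2:CreateSecurityGroup",
    "ec2:CreateTags",
    "ec2:DescribeNetworkInterfaces",
    "ec2:DescribeSecurityGroups",
    "ec2:DescribeSubnets",
    "ec2:DescribeVpcs",
    "ec2:DeleteSecurityGroup",
]

# Assumption: only needed when something (e.g. a Dask cluster on ECS)
# creates and cleans up its own security group.
SECURITY_GROUP_WRITE_ACTIONS = {
    "ec2:AuthorizeSecurityGroupIngress",
    "ec2:CreateSecurityGroup",
    "ec2:DeleteSecurityGroup",
}


def scoped_policy(actions: Iterable[str], drop: Iterable[str]) -> Dict[str, Any]:
    """Return an IAM policy document allowing `actions` minus `drop`."""
    dropped = set(drop)
    kept = [action for action in actions if action not in dropped]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": kept,
                # EC2 Describe* actions do not support resource-level scoping.
                "Resource": "*",
            }
        ],
    }


policy = scoped_policy(AGENT_EC2_ACTIONS, SECURITY_GROUP_WRITE_ACTIONS)
print(json.dumps(policy, indent=2))
```

Attach the result to a test copy of the task role first, trigger a flow run, and add an action back only if the run fails with an access denied error for it.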

Lana Dann

11/23/2021, 10:07 PM
thank you for your help, i got the agent running! i have a followup question on how we can use our ECS agent and the ECS flow run. we have ECS tasks defined that have a command that kicks off a script that does the actual data work that we’re trying to get done. from my understanding we can use ECSRun and just pass in the ARN of the task definition and register our flow of one task (and that one task is just running the ECS task). but what if i want to define a dependency between two separate ECS tasks with separate task definitions? when one finishes, i’d like to run the second. also, how can i create a flow that uses two different agents depending on the task?

Anna Geller

11/23/2021, 10:13 PM
@Lana Dann you can set both - the custom task_definition, as well as a custom label on the run configuration. And you can modify a run configuration of a child flow from a parent flow. Let me give you an example
here is the example - note that even though the parent flow runs in prod, it runs a child flow on a “dev” agent due to labels. Also the run config is different between the parent and the child flow:
from prefect import Flow
from prefect.tasks.prefect import create_flow_run, wait_for_flow_run
from prefect.storage import S3
from prefect.run_configs import ECSRun

STORAGE = S3(
    bucket="your_bucket_name",
    key="flows/parent_flow.py",
    stored_as_script=True,
    local_script_path="parent_flow.py",
)


with Flow(
    "parent_flow",
    storage=STORAGE,
    run_config=ECSRun(
        labels=["prod"],
        task_definition_path="s3://bucket/flow_task_definition.yaml",
        run_task_kwargs=dict(cluster="prefectEcsCluster"),
    ),
) as flow:
    child_flow_id = create_flow_run(
        flow_name="child-flow-name",
        project_name="child-flow-project",
        run_config=ECSRun(
            labels=["dev"],
            task_role_arn="arn:aws:iam::XXXX:role/prefectTaskRole",
            execution_role_arn="arn:aws:iam::XXXX:role/prefectECSAgentTaskExecutionRole",
            image="XXXX.dkr.ecr.us-east-1.amazonaws.com/image_name:latest",
            run_task_kwargs=dict(cluster="prefectEcsCluster"),
        ),
    )
    wait_task = wait_for_flow_run(child_flow_id, raise_final_state=True)
The last task waits until the child flow finishes. After that you could trigger another task or child flow, depending on your use case. If you want more ECS examples, check out this docstring: https://docs.prefect.io/api/latest/run_configs.html#ecsrun
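For the "run the second when the first finishes" case, a minimal sketch could chain two child flows by making each create_flow_run depend on the previous wait_for_flow_run. It assumes Prefect 1.x and that each child flow is already registered. The flow names, labels, and task definition ARNs below are placeholders, not real resources.

```python
from typing import Dict, List

# Hypothetical child flows, listed in the order they should run.
# Each one gets its own agent label and its own ECS task definition.
CHILD_FLOWS: List[Dict[str, str]] = [
    {
        "flow_name": "first-child-flow",
        "project_name": "child-flow-project",
        "label": "dev",
        "task_definition_arn": "arn:aws:ecs:us-east-1:XXXX:task-definition/first:1",
    },
    {
        "flow_name": "second-child-flow",
        "project_name": "child-flow-project",
        "label": "prod",
        "task_definition_arn": "arn:aws:ecs:us-east-1:XXXX:task-definition/second:1",
    },
]


def build_parent_flow():
    """Build a parent flow that runs the child flows one after another."""
    # Prefect 1.x imports, kept local so the configuration above can be
    # inspected without Prefect installed.
    from prefect import Flow
    from prefect.run_configs import ECSRun
    from prefect.tasks.prefect import create_flow_run, wait_for_flow_run

    with Flow("sequential_parent_flow") as flow:
        previous_wait = None
        for child in CHILD_FLOWS:
            run_id = create_flow_run(
                flow_name=child["flow_name"],
                project_name=child["project_name"],
                run_config=ECSRun(
                    labels=[child["label"]],
                    task_definition_arn=child["task_definition_arn"],
                ),
                # Start only after the previous child flow has finished.
                upstream_tasks=[previous_wait] if previous_wait is not None else None,
            )
            # raise_final_state=True fails the parent if the child fails,
            # so the next child flow is not started after a failure.
            previous_wait = wait_for_flow_run(run_id, raise_final_state=True)
    return flow
```

Because each child flow carries its own label, the two runs can be picked up by different agents, while the parent flow only handles the ordering.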