# prefect-community
d
Hey guys, I'm having trouble running a flow on FargateTaskEnvironment. The configuration for the environment is below:
flow.environment = FargateTaskEnvironment(
    launch_type="FARGATE",
    region="eu-west-1",
    cpu="256",
    memory="512",
    networkConfiguration={
        "awsvpcConfiguration": {
            "assignPublicIp": "ENABLED",
            "subnets": ["subnet-X"],
            "securityGroups": ["sg-Y"],
        }
    },
    family="my_flow",
    taskRoleArn="arn:aws:iam::X:role/CommonSuperRole",
    executionRoleArn="arn:aws:iam::X:role/CommonSuperRole",
    containerDefinitions={
        "name": "my-flow",
        "image": "my-flow",
        "command": [],
        "environment": [],
        "essential": True,
    }
)
I keep getting this error:
An error occurred (ClientException) when calling the RegisterTaskDefinition operation: Fargate requires task definition to have execution role ARN to support log driver awslogs
So my questions:
• I’m using the same uber role for both the taskRoleArn and the executionRoleArn. Probably not best practice, but it should work?
• I’ve thrown every possible log and CloudWatch related policy/permission at it that I can think of, but nothing is taking. I can provide a dump of the permissions if need be.
Any help massively appreciated, fairly stumped on it. Is there any way to get more debug info out of it?
j
Hi @Darragh. We run the Fargate agent for a bunch of production Prefect Flows. Usually the task role and execution role are different, with the execution role almost always being the standard one named `ecsTaskExecutionRole`. It looks like yours are set to the same role. See this in the AWS docs: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_execution_IAM_role.html
(FWIW, I find the AWS naming of these two roles to be a disaster. Specifically, the execution role (not the task role) is almost always set to something (`ecsTaskExecutionRole`) that has both of those words in the name!?)
d
AWS roles and policies are a wild west all unto themselves 🙂 I followed that guide but no joy. One interesting line in it though: "The task execution role is supported by Amazon ECS container agent version 1.16.0 and later"
No idea if there’s any correlation or not.
Is there any way to get more debug info out of it?
j
@Darragh could you try these 2 things (should be quick):
1. Confirm that your AWS account has a role for `ecsTaskExecutionRole`
2. Update your FargateTaskEnvironment to use the `ecsTaskExecutionRole` for execution role and your own role for task role, like this:
task_role_arn="arn:aws:iam::<your-aws-account-number>:role/<your-aws-iam-role-name>",
execution_role_arn="arn:aws:iam::<your-aws-account-number>:role/ecsTaskExecutionRole",
d
Testing now…
Same error
Python snippet from the flow:
flow.environment = FargateTaskEnvironment(
    launch_type="FARGATE",
    region="eu-west-1",
    cpu="256",
    memory="512",
    networkConfiguration={
        "awsvpcConfiguration": {
            "assignPublicIp": "ENABLED",
            "subnets": ["subnet-RANDOM"],
            "securityGroups": ["sg-RANDOM"],
        }
    },
    family="my_flow",
    taskRoleArn="arn:aws:iam::RANDOM:role/Orchestration-PrefectRole1E6EFC48-1J7FK5V772G96",
    executionRoleArn="arn:aws:iam::RANDOM:role/ecsTaskExecutionRole",
    containerDefinitions={
        "name": "batch-universe",
        "image": "batch-universe",
        "command": [],
        "environment": [],
        "essential": True,
    }
)
I'm now wondering whether it's possible to turn the logging and roles off just to get past this issue 😂
Or even how to turn on verbose logging for the agent
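On the "more debug info" question, one generic option is to raise the log level of the AWS SDK's own loggers. The sketch below is not a Prefect feature; it only uses Python's standard `logging` module and assumes it runs in the same Python process that makes the boto3 calls. `enable_aws_debug_logging` is a hypothetical helper name. At DEBUG level, botocore should log the request it builds for calls such as RegisterTaskDefinition, which helps to confirm which parameters were actually sent.

```python
import logging


def enable_aws_debug_logging(level=logging.DEBUG):
    """Raise the log level of the AWS SDK loggers (hypothetical helper)."""
    # Send log records to stderr with timestamps. This call has no effect
    # if the root logger was already configured elsewhere in the process.
    logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s %(message)s")
    # "boto3" and "botocore" are the logger names the AWS Python SDK logs under.
    for name in ("boto3", "botocore"):
        logging.getLogger(name).setLevel(level)


enable_aws_debug_logging()
```

Be aware that DEBUG output from botocore is very noisy and can include request contents, so it is best kept to local troubleshooting.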
j
@Darragh I'm surprised that you're getting an error complaining about "log driver awslogs". We specifically add that to our containerDefinitions, but you don't have that in your config. Here's ours:
containerDefinitions=[
    {
        "command": [],
        "environment": [
            {"name": "PREFECT__LOGGING__LOG_TO_CLOUD", "value": "true"},
            {"name": "AWS_DEFAULT_REGION", "value": REGION_NAME},
        ],
        "essential": True,
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": "blah-blah",
                "awslogs-region": "our region",
                "awslogs-stream-prefix": "blah blah",
            },
        },
    }
],
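The ECS error quoted earlier ties two settings together: a container that uses the `awslogs` log driver and a task definition without an execution role ARN. A minimal local pre-check for that combination could look like the sketch below. It assumes the task definition is available as a plain dict with the same keys boto3's `register_task_definition` takes; `find_task_definition_problems` and the example values are hypothetical, and the check covers only the two conditions shown, not everything ECS validates.

```python
def find_task_definition_problems(task_definition):
    """Return a list of human-readable problems found in a task definition dict."""
    problems = []
    containers = task_definition.get("containerDefinitions", [])
    # boto3 expects a list of container dicts, not a single dict.
    if isinstance(containers, dict):
        problems.append("containerDefinitions should be a list of dicts")
        containers = [containers]
    uses_awslogs = any(
        c.get("logConfiguration", {}).get("logDriver") == "awslogs"
        for c in containers
    )
    # Mirrors the ECS error: awslogs on Fargate needs an execution role ARN.
    if uses_awslogs and not task_definition.get("executionRoleArn"):
        problems.append("awslogs log driver is set but executionRoleArn is missing")
    return problems


# Hypothetical example: awslogs is configured, but no execution role is set.
example = {
    "family": "my_flow",
    "containerDefinitions": [
        {"name": "my-flow", "logConfiguration": {"logDriver": "awslogs"}}
    ],
}
print(find_task_definition_problems(example))
```

Running the same check against both the flow's and the agent's settings would show which of the two is missing the role.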
d
@Joe Schmid We might be looking at different definitions here. Mine is for the flow definition, and I think yours is for the agent, is that right?
Agent definition here..
j
@Darragh that makes sense now. I'd recommend getting a Flow that uses a simple RemoteEnvironment() to work first. (If you haven't already.)
I don't see where your agent is getting execution role ARN and task role ARN. Is that coming from environment variables? You can try the same configuration I mentioned above using `ecsTaskExecutionRole` for execution role and your `CommonSuperRole` for task role with the agent. (Ignore me if your Fargate Agent is already working fine.)
d
@Joe Schmid Yeah, I have a basic one working for remote, but our major use case is on Fargate. I'm going to test the roles tomorrow on a manual Fargate task deploy and see if they work at all!
Are the roles supposed to be set on the agent? I thought they were only on the flow?
j
Those roles definitely need to be set for the agent.
d
Ah, from what I read I thought they were on the flow only! OK, I'll test that, thanks Joe
j
No problem, post here with what you find and we'll get you up and running soon!
(Fargate Agent and FargateTaskEnvironment both register task definitions with ECS and run ECS tasks. The Fargate Agent will make a new ECS task -- and corresponding task definition -- for each Flow run. The FargateTaskEnvironment would make an additional ECS task.)
For us, we run Flows with the Fargate Agent that either:
1. Just use a simple RemoteEnvironment() with no parameters, so that the Flow run occurs in the ECS task that the Fargate Agent creates
2. Use DaskCloudProviderEnvironment to create a truly distributed Dask cluster with ECS, i.e. the Dask scheduler and Dask workers run as independent ECS tasks
I would encourage anyone NOT to start with DaskCloudProviderEnvironment, as it adds complexity. Just get #1 working successfully first, then scale up if needed.
d
Thanks Joe! So I might not actually need the Fargate task environment at all, is that right?
Added the ARNs to both the flow and the agent, and hey presto, a new error!!
An error occurred (ClusterNotFoundException) when calling the RunTask operation: Cluster not found
j
So I might not actually need the Fargate task environment at all, is that right?
That’s correct. There’s certainly no requirement to use it if you’re using the Fargate Agent.
a new error!!
An error occurred (ClusterNotFoundException) when calling the RunTask operation: Cluster not found
That’s definitely progress! Since the error is in RunTask, your Fargate Agent was able to successfully register a task definition (and is now trying to run it). Since your `launch_type` is `FARGATE`, I think it should use the default cluster in ECS if you don’t specify one, but maybe try specifying a cluster ARN for your Fargate Agent. You also mentioned “Added ARNs to both the flow and the agent”. If the Flow is still using FargateTaskEnvironment, I would switch that to RemoteEnvironment().
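To make the "specify a cluster" suggestion concrete: in the ECS RunTask API, `cluster` is optional, and when it is omitted ECS assumes the cluster named `default`. If no such cluster exists in the region being used, that would be one plausible cause of ClusterNotFoundException. The sketch below builds the keyword arguments for boto3's `run_task` with an explicit cluster. `build_run_task_kwargs` is a hypothetical helper, the cluster name and IDs are placeholders, and how these arguments reach the Fargate Agent is not shown here.

```python
def build_run_task_kwargs(task_definition, subnets, security_groups, cluster=None):
    """Build keyword arguments for the ECS RunTask call on Fargate."""
    kwargs = {
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "assignPublicIp": "ENABLED",
                "subnets": list(subnets),
                "securityGroups": list(security_groups),
            }
        },
    }
    # Only set "cluster" when one is given; otherwise ECS falls back to the
    # cluster named "default" in the client's region.
    if cluster:
        kwargs["cluster"] = cluster
    return kwargs


# Placeholder values; "my-cluster" is a hypothetical cluster name.
run_task_kwargs = build_run_task_kwargs(
    task_definition="my_flow",
    subnets=["subnet-X"],
    security_groups=["sg-Y"],
    cluster="my-cluster",
)
# With boto3 installed and credentials configured, this could be passed as:
#   boto3.client("ecs", region_name="eu-west-1").run_task(**run_task_kwargs)
```

The cluster also has to exist in the same region the client is pointed at, so the region setting is worth checking alongside the cluster name.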
l
For people who get the "An error occurred (ClientException) when calling the RegisterTaskDefinition operation: Fargate requires task definition to have execution role ARN to support log driver awslogs" error message and might be searching for help in the channel later: make sure that you spelled `taskRoleArn` and `executionRoleArn` in camelCase and not in snake_case. Just spent a good billion hours looking for the source of the problem everywhere else but in my spelling lol
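A small check can catch this spelling mistake before anything is sent to AWS. The sketch below is hypothetical: `BOTO3_KEYS` lists only the camelCase keys that appear in this thread, not the full set of parameters, and `find_misspelled_keys` simply reports any key that would match one of them if it were converted from snake_case. Legitimate snake_case arguments such as `launch_type` are left alone.

```python
# camelCase keys that appear in this thread; not an exhaustive list.
BOTO3_KEYS = {
    "taskRoleArn",
    "executionRoleArn",
    "containerDefinitions",
    "networkConfiguration",
}


def snake_to_camel(name):
    """Convert snake_case to camelCase, e.g. task_role_arn -> taskRoleArn."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def find_misspelled_keys(kwargs):
    """Map each key that is a snake_case spelling of a known key to its fix."""
    return {
        key: snake_to_camel(key)
        for key in kwargs
        if key not in BOTO3_KEYS and snake_to_camel(key) in BOTO3_KEYS
    }


settings = {
    "launch_type": "FARGATE",
    "task_role_arn": "arn:aws:iam::X:role/CommonSuperRole",
    "executionRoleArn": "arn:aws:iam::X:role/ecsTaskExecutionRole",
}
print(find_misspelled_keys(settings))
```

Here only `task_role_arn` is reported, with `taskRoleArn` as the suggested spelling.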
m
@Darragh did you get around the `Cluster not found` error? I am facing the same
d
Hi Maikel, we actually moved away from it for a while due to other problems 🙂 Hoping to get back to it soon and fixing up our remaining problems
m
Thanks Darragh. If anyone here has seen this error before (and fixed it), can you please share the solution? 🙂
An error occurred (ClusterNotFoundException) when calling the RunTask operation: Cluster not found