# prefect-server
e
Hey everyone, my teammate @Zaid Naji and I are trying to run flows using the Fargate agent. We're getting this error when we try to run a flow:
An error occurred (InvalidParameterException) when calling the RegisterTaskDefinition operation: Invalid 'cpu' setting for task.
The entrypoint we're using for the agent is:
["prefect","agent","start","fargate","cpu=256","memory=1024","networkConfiguration=$NETWORK_CONFIGURATION"],
Does anyone know what we should be setting for
cpu
to get it working?
n
Hi @Emma Willemsma - I think that setting should be a string itself, so
cpu="256"
The same with
memory
e
Oh is this a docs issue then? We were following this as an example: https://docs.prefect.io/orchestration/agents/fargate.html#prefect-cli-using-kwargs
n
Ah it looks like that might be incorrect, can you give it a try with them as strings and report back? If so we can update the documentation
e
We're having a really hard time making this work. We're running the agent as a Fargate service, and we've tried a bunch of variants (with and without quotes and escape characters) for the 
cpu
 parameter in the 
entryPoint
 and we keep getting the same error. So this for example isn't working:
["prefect","agent","start","fargate","cpu=\"256\"","memory=\"1024\"","networkConfiguration=$NETWORK_CONFIGURATION"],
Has anyone gotten this to work?
👀 1
d
What are acceptable CPU values for Fargate? We’re on GCP but I’ll do my best to help
Looks like 256 is an acceptable value
And it’s not working as a number or as a string?
e
Yeah we've tried it both ways
j
Just jumping in here 🙂 did you happen to try
"cpu='256'"
?
s
Hi, I'm using the Fargate Agent for my runs. I found that using the ENV is the only way to configure it reliably. The Agent won't load some configurations from arguments due to oversight in the code. I resorted to using ENV for almost all of it. My entrypoint is
["prefect", "agent", "start", "fargate", "enable_task_revisions=true"]
With the environment set of:
PREFECT__BACKEND: server
      PREFECT__CLOUD__AGENT__AGENT_ADDRESS: http://127.0.0.1:8080
      PREFECT__CLOUD__AGENT__LABELS: '["s3-flow-storage"]'
      executionRoleArn: ecs-task-execution-role
      memory: 512
      cpu: 256
      networkConfiguration: ${NETWORK_CONFIGURATION}
      taskRoleArn: prefect-agent-role
      containerDefinitions_logConfiguration: ${LOG_CONFIGURATION}
      cluster: ${CLUSTER_NAME}
e
Ah, good to know
Ok thanks, we'll give this a try!
🙏 1
j
I think I might know why this is failing from the CLI! It’s most likely due to a parsing mismatch. When cpu and memory are passed in, we don’t evaluate the literal value because the agent assumes they are being passed in as strings. However, when they are passed in from the CLI entrypoint like above, it seems to interpret them as integers! (which is fine for every other kwarg except cpu and memory, because their literal value is being interpolated) https://github.com/PrefectHQ/prefect/blob/master/src/prefect/agent/fargate/agent.py#L327 I’m going to put together a fix for this 🙂
😀 1
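A minimal sketch of the kind of mismatch described above. It is not the agent's actual `_parse_kwargs` code; `parse_kwarg` and `as_task_definition_fields` are hypothetical helpers. The assumption is that a `key=value` argument gets evaluated as a Python literal, so `cpu=256` arrives as an integer, while the ECS `RegisterTaskDefinition` API takes the task-level `cpu` and `memory` as strings.

```python
import ast


def parse_kwarg(raw):
    """Split 'key=value' and evaluate the value as a Python literal if possible.

    Hypothetical helper, for illustration only.
    """
    key, _, value = raw.partition("=")
    try:
        return key, ast.literal_eval(value)
    except (ValueError, SyntaxError):
        # Not a Python literal (e.g. a bare word): keep the raw string.
        return key, value


def as_task_definition_fields(kwargs):
    """Return a copy of kwargs with cpu and memory coerced to strings."""
    fields = dict(kwargs)
    for name in ("cpu", "memory"):
        if name in fields:
            fields[name] = str(fields[name])
    return fields


parsed = dict(parse_kwarg(arg) for arg in ["cpu=256", "memory=1024"])
print(parsed)                             # cpu and memory are ints here
print(as_task_definition_fields(parsed))  # cpu and memory are strings here
```

Coercing only these two fields keeps every other kwarg evaluated as before.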
s
The entire
_parse_kwargs
stuff is rather difficult to follow 😓 I didn't make a PR because I couldn't quite understand the reason for the complexity (nor had the time to understand the flow).
j
Oh yes it is big time and we actually have a way cleaner path forward with a new RunConfig pattern we are introducing 🙂
🤝 2
e
@Spencer thanks for the help, we finally got our Fargate agent working using your suggestion 🙂
🙌 2
z
Hi, thanks guys for the support. So we are running into an interesting issue. When passing the executionRoleArn and taskRoleArn to the agent, it does not pass them to the flow tasks; it passes its own task and execution roles to them instead. Is that intended, given that the Prefect agent is itself deployed on ECS as a service?
s
It is intended; the task/execution roles for the flows are configured in the
FargateTaskEnvironment
that you attach to the flows
z
Ah prefect thank you 🙏
s
The task and execution roles that you configured before are solely for the flow boot task (which starts the flow run)
I found it to be a bit tricky to cleanly attach the environment to the flows; so I wrote an internal deployment library that will crawl the files, gather all the flows (essentially imports each file and pulls out all module attributes of type
prefect.Flow
) and update their environments to the proper
FargateTaskEnvironment
before registering them. Also, I am using the
S3Storage
alongside this. You can of course just set it directly on your flows; I just wasn't a fan of that configuration. I wanted my data engineers to not have to be aware of it.
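The crawl-and-collect approach described above could look roughly like the following. This is a sketch under assumptions, not the internal library itself: `load_module` and `collect_instances` are hypothetical names, and the type to collect is passed in so the snippet does not depend on Prefect being installed.

```python
import importlib.util
from pathlib import Path


def load_module(path):
    """Import a Python file by path and return the module object."""
    spec = importlib.util.spec_from_file_location(Path(path).stem, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def collect_instances(namespace, cls):
    """Return every attribute of `namespace` that is an instance of `cls`."""
    return [value for value in vars(namespace).values() if isinstance(value, cls)]


def collect_from_directory(root, cls):
    """Import every .py file under `root` and gather instances of `cls`."""
    found = []
    for path in sorted(Path(root).rglob("*.py")):
        found.extend(collect_instances(load_module(path), cls))
    return found
```

With Prefect installed, `cls` would be `prefect.Flow`, and each collected flow could then be given its `FargateTaskEnvironment` before registration. Note that importing each file runs its top-level code.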
z
Got it thanks. So we have to provision the flow roles separately
And attach them to the flow and register
s
Yeah
z
Cool thanks 🙏
s
The Fargate agent is a bit funny to host on ECS because its runtime will be running on ECS (polling the API). It will spawn a task on ECS that will download the configuration from the API, which will then spawn another task that will run your flow. Agent polling -> Intermediate task (flow run) -> Your flow executing
👍 1
z
Is there documentation to show us how to add the flow role arns before registering them?
s
You just specify them in the
FargateTaskEnvironment
constructor:
taskRoleArn
and
executionRoleArn
If using a custom docker image for the tasks, you need to specify it in the metadata field too
My FargateTaskEnvironment looks something like:
FargateTaskEnvironment(
        # Task Definition
        family=task_definition_name,
        taskRoleArn=settings.task_role_arn,
        executionRoleArn=settings.execution_role_arn,
        cpu=settings.cpu,
        memory=settings.memory,
        containerDefinitions=[
            {
                "name": "flow-container",
                "image": "image",
                "command": [],
                "environment": [],
                "essential": True,
                "logConfiguration": {
                    "logDriver": "awslogs",
                    "options": {
                        "awslogs-group": settings.awslogs_group,
                        "awslogs-region": settings.awslogs_region,
                        "awslogs-stream-prefix": settings.awslogs_stream_prefix,
                    },
                },
            }
        ],
        networkMode="awsvpc",
        requiresCompatibilities=["FARGATE"],

        # Task Run
        cluster=settings.cluster_name,
        region=aws_settings.region,
        taskDefinition=task_definition_name,
        launch_type="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {
                "assignPublicIp": "ENABLED"
                if settings.assign_public_ip
                else "DISABLED",
                "subnets": settings.subnets,
            }
        },
        metadata={"image": image},
    )
z
Thank you for the example. When we pass the container definition like this the fargate agent is not picking it up after registering the flow and it complains about missing parameters.
After checking your above comment on the ECS deployment of the agent now I get what’s happening. The initial env variables are for the task that pulls the flow config not the flow definition itself. Do you suggest a better deployment model for the fargate agent?
s
I host the Fargate Agent on Fargate myself 🤷‍♂️ it works for me
The containerDefinitions in the
FargateTaskEnvironment
get overridden by the agent when constructing the task IIRC
z
So in our use case, we need to pass different roles to the flow tasks. We don’t want the agent to forward its role to the flow tasks.
I guess it will only pass its own roles and that might not work for us
s
Oh wow, OK. Using a different role per flow would be a bit cumbersome. I think you'd have to construct a separate
FargateTaskEnvironment
with different
taskRoleArn
for each role.
The task that pulls down the flow from the API uses the task role from the agent configuration. I'm not sure what to call that launching task 🤷‍♂️
z
Oh so the task that pulls the config gets passed the role from the agent. What about the flow task itself (that runs the flow)?
Yea we have a single tenancy requirement which demands each customer to have their isolated environment. Our other option would be to delegate executions to something like AWS batch or Databricks and the prefect task would not have direct access to customer data.
s
The flow task itself should use the FargateTaskEnvironment ARNs
I'm not sure how you could have the roles be dynamically applied in the flows.. Perhaps you can specify a dynamic role that the task should assume within the task code? Having a root role that can only
sts:AssumeRole
that the tasks can use directly; then you setup boto3 with an
AssumeRoleCredentialsProvider
(or some such; I know boto3 doesn't have this natively but other AWS SDKs do; for boto3 https://stackoverflow.com/a/45834847) in the session?
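A minimal sketch of the assume-role idea, assuming the task's root role is allowed to call `sts:AssumeRole` on a per-customer role. The helper names, the default session name and the example ARN are hypothetical. The credentials returned by STS are temporary and this sketch does not refresh them; the Stack Overflow answer linked above covers refreshable credentials.

```python
def assume_role_credentials(sts_client, role_arn, session_name="prefect-flow-task"):
    """Call STS AssumeRole and return keyword arguments for boto3.Session."""
    response = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def session_for_role(role_arn):
    """Build a boto3 session that acts as `role_arn` (not auto-refreshing)."""
    import boto3  # imported here so the helper above works with any STS-like client

    sts = boto3.client("sts")
    return boto3.Session(**assume_role_credentials(sts, role_arn))


# Inside a task, with a hypothetical per-customer role ARN:
# session = session_for_role("arn:aws:iam::123456789012:role/customer-a")
# s3 = session.client("s3")
```

Passing the STS client in as an argument keeps the credential mapping easy to exercise without AWS access.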
z
Fair thanks for the suggestion. Will look at the different strategies and get back to you.
a
Hi @Spencer! Regarding this https://prefect-community.slack.com/archives/C014Z8DPDSR/p1601671347075000?thread_ts=1601583340.049800&cid=C014Z8DPDSR How do you specify taskRoleArn and taskExecutionRoleArn in the metadata field?
s
This comment is referring to using a custom docker container for your flow execution; you specify this in the FargateTaskEnvironment's metadata field
metadata={"image": ...}
The taskRoleArn and executionRoleArn are native fields in the FargateTaskEnvironment, separate from the metadata field.
a
Ok, thank you! 👍