Would anyone be able to share what `ECSRun` is doi...
# ask-community
e
Would anyone be able to share what `ECSRun` is doing under the hood to run tasks? I have set a subnet for the service I am running my Prefect tasks in and have disabled public IPs, but when I call `ECSRun` my tasks are running with a public IP address in a different subnet. From reading the documentation for `ecs.run_task` it seems as though you are required to specify a subnet ID if you are running a task on Fargate, which I am. Therefore, I’m wondering what subnet ID Prefect is using and whether it can be forced to use the default subnet ID of the service. I am aware that you can override the arguments passed to `run_task` by `ECSRun`; however, I don’t want to do this as I want to make my config agnostic to the account that it is being deployed in (I am deploying the same infra to different AWS accounts and don’t want to hard code the subnet ID which is passed to `ECSRun`).
k
Hey @Eddie Atkinson, Prefect will infer the default VPC and use the subnet from that. In my experience, if you have other VPCs beyond the default, the inference becomes unreliable. Yes, you can pass it to `ECSRun`, but if you don’t like that approach, this information can also be passed on the agent side. It is not directly exposed as a flag by Prefect so you’d need to include it in the Task Definition.
e
but if you don’t like that approach, this information can also be passed on the agent side
Hi @Kevin Kho, do you have any examples of this? I assume I have to subclass `Agent` and override its `deploy_flows` method, right?
k
No you just need to supply a custom task definition like this
e
Is the subnet information exposed in the `task_definition`? Looking through AWS’ documentation I can’t see a spot where it’s specified.
k
Over here, it would be under Network mode, and then it has to be specified if your network mode is `awsvpc` by supplying a `NetworkConfiguration`.
I guess this is the best example I can find for the syntax.
e
From the AWS docs you linked under the `networkMode` section:
If the network mode is `awsvpc`, the task is allocated an elastic network interface, and you must specify a `NetworkConfiguration` when you create a service or run a task with the task definition.
To me this implies that the `networkConfiguration` must be supplied at the level of the service, or in the call to `run_task`. I’m just waiting on a Docker image to push to get screenshots, but I have specified the subnets and security groups on the service, and these seem to be overridden when `run_task` is called by my Prefect ECS agent.
I am thinking the solution is actually to use the `run-task-kwargs` CLI arg to `prefect agent ecs start` to override the `networkConfiguration` passed to every call to `run_task` by the ECS agent starting my flows.
k
I agree, `run-task-kwargs` on the agent is a good solution if it works, yep. Would like to see screenshots if you get the chance just for my understanding.
e
So I ran a flow and this was the ECS task that was spawned by prefect:
and this is the config for the service it was spawned into:
I would anticipate that it would have taken all the subnet / security group info from the service, but my guess is that `ECSRun` is doing its own thing here.
k
I can look into this more tomorrow, but I just wanna clarify that the service there is an agent?
e
It’s not an agent, it’s a service I provisioned for running prefect flows in. Apologies for the wall of text but hopefully my serverless.yml setup clarifies what I’m doing better:
tasks:
      prefect-task:
        name: prefect-task
        image: ${env:PREFECT_DOCKER_IMAGE}
        desired: 0
        cpu: 1024 # 1vcpu
        memory: 2048 # 2Gb
        override:
          role: ${self:custom.ecsBaseTaskRoleName}
          container:
            Name: flow
            # container must be called 'flow' for prefect to run tasks
            # when using ECSRun: <https://docs.prefect.io/api/latest/run_configs.html#ecsrun>
            # non-obvious, but changing to use the AWS Log driver
            # and an awslogs-stream-prefix significantly affects data included
            # <https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_awslogs.html>
            LogConfiguration:
              LogDriver: "awslogs"
              Options:
                awslogs-group: ${self:custom.ecsClusterLogGroupName}
                awslogs-region: ${self:custom.region}
                awslogs-stream-prefix: ${self:custom.ecsTaskLogPrefix}

      prefect-task-server:
        name: ${self:custom.ecsTaskName}-prefect-task-server
        image: "prefecthq/prefect:latest"
        desired: 1
        cpu: 256 # 0.25vcpu
        memory: 512 # 512mb
        environment:
          EXEC_ROLE_ARN: ${self:custom.ecsBaseExecRoleArn}
          TASK_ROLE_ARN: ${self:custom.ecsBaseTaskRoleArn}
          LOG_LEVEL: INFO
          LABEL: feeds-jobs
          CLUSTER_ARN: ${self:custom.ecsClusterArn}
          AGENT_NAME: feeds-task-server
          AGENT_API_KEY: ${self:custom.prefectApiKey.prefect-api-key}

        override:
          role: ${self:custom.ecsBaseTaskRoleName}
          container:
            command:
              - "sh"
              - "-c"
              - "prefect agent ecs start --key $AGENT_API_KEY --execution-role-arn $EXEC_ROLE_ARN  --task-role-arn $TASK_ROLE_ARN --log-level $LOG_LEVEL --label $LABEL --cluster $CLUSTER_ARN --name $AGENT_NAME"
            # non-obvious, but changing to use the AWS Log driver
            # and an awslogs-stream-prefix significantly affects data included
            # <https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_awslogs.html>
            LogConfiguration:
              LogDriver: "awslogs"
              Options:
                awslogs-group: ${self:custom.ecsClusterLogGroupName}
                awslogs-region: ${self:custom.region}
                awslogs-stream-prefix: ${self:custom.ecsTaskLogPrefix}
What you see in the screenshot is the service for the `prefect-task` definition above. My agent runs in the `prefect-task-server` service.
k
I am a bit confused why you need a service since `ECSRun` from Prefect will register a task definition and run it?
e
Because I wanted to package my own libraries in the docker image and call them from the flow
k
And you don’t want to supply it as an image to ECSRun?
e
Ideally not no. The idea being that if I deploy this infrastructure across accounts the same task family will exist in both accounts and I won’t need to change an image ARN in my flow code (the images are deployed to a private ECR repo in each account)
k
Gotcha ok. I’ll look into this more tomorrow 👍
e
Thanks for your patience. If I am attempting to fight the framework too much I am happy to consider an alternate approach.
k
No problem! Nah I’m just trying to understand what you’re going for better and see if we can make it work
This will be a wall of text. I hope this clears things up a bit. I went through the YAML and read a bit and I think I understand what is going on. Just some clarifications first.
So the `ECS Service` that you provisioned is indeed a `Prefect agent`. The Service is responsible for picking up the `Prefect flow` and then starting the `ECS Task`. There are two ways to set the task definition stuff. The first is through `ECSRun` (which we don’t want to do). The second is through the `Prefect agent`, which is the `ECS Service` you provisioned. In order to provision this on the `Prefect agent`, you need to do something like
prefect agent ecs start --task-definition /path/to/my_definition.yaml
or
prefect agent ecs start --run-task-kwargs /path/to/options.yaml
The point is that these are loaded in upon agent start. You can find that code snippet here. It loads in these configurations, and then when the flow starts, the `RunConfig` items override what was in these `yaml` files, and then it kicks off the `Prefect flow` as an `ECS task`. So when that `ECS Service` is created for the `Prefect agent`, it already has a fixed configuration that can only be overridden by the `ECSRun`.
Now, you mentioned that you attached some network information to the `ECS Service (Prefect agent)`, but looking at the command in your yaml,
- "prefect agent ecs start --key $AGENT_API_KEY --execution-role-arn $EXEC_ROLE_ARN  --task-role-arn $TASK_ROLE_ARN --log-level $LOG_LEVEL --label $LABEL --cluster $CLUSTER_ARN --name $AGENT_NAME"
I am not seeing anything beyond the cluster ARN. This needs either a `task definition` or `run_task_kwargs` that will hold the `networkConfiguration`. Once it’s attached here, it would apply to all of the Flows that it picks up unless overridden by the `ECSRun`.
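The precedence described here (agent-level options loaded at startup, then overridden per flow by the `ECSRun`) can be sketched as a plain dictionary merge. This is only an illustration under the assumption of a shallow, key-by-key merge; the helper name `merge_kwargs` and the example IDs are hypothetical, and Prefect's real merge logic may treat nested keys differently.

```python
import copy


def merge_kwargs(agent_kwargs, flow_kwargs):
    """Start from the agent-level defaults and let flow-level keys win.

    Hypothetical helper: a shallow merge, so a flow-level
    `networkConfiguration` replaces the agent-level one as a whole.
    """
    merged = copy.deepcopy(agent_kwargs)
    merged.update(copy.deepcopy(flow_kwargs or {}))
    return merged


# Options the agent would have loaded once at startup (made-up values).
agent_defaults = {
    "cluster": "example-cluster",
    "networkConfiguration": {
        "awsvpcConfiguration": {
            "subnets": ["subnet-agent"],
            "securityGroups": ["sg-agent"],
            "assignPublicIp": "DISABLED",
        }
    },
}

# A flow without its own run_task_kwargs inherits everything from the agent.
no_override = merge_kwargs(agent_defaults, None)

# A flow that sets networkConfiguration replaces the agent's value for that key.
with_override = merge_kwargs(
    agent_defaults,
    {"networkConfiguration": {"awsvpcConfiguration": {"subnets": ["subnet-flow"]}}},
)
```

With a shallow merge like this, a flow that overrides `networkConfiguration` has to supply the full block, because nothing from the agent's block is carried over.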
So what is happening is that your `ECS Service` is starting in some subnet, but that’s not necessarily the same subnet your `ECS Task/Prefect Flow` will be in unless explicitly specified. I believe there are just some default subnets (maybe 3), and unless specified, it would use one of the defaults. There is nothing tying the `ECS Service` subnet to the `ECS Task` because the `Prefect Agent` is simply a process that is starting the `Prefect Flow/ECS Task` without knowing anything about where it is running. For example, if you started an `ecs agent` on your local machine and deployed a `Prefect Flow`, it would just grab a default subnet. The same thing is going on here.
So if the `networkConfiguration` is not specified for the agent, Prefect can infer it. Unfortunately, the inferring of `networkConfiguration` happens during agent startup here, which means that it infers something and then will pass that to all flows that it starts. In short, the agent is already locked to one `networkConfiguration` by the time the flows come in.
That might be fine for you. It just means that you need to spin up different agents, each with their own configuration, so that you don’t need to specify this in the `ECSRun`. I think this is what you were going for, so yes it can be done with multiple `Prefect agents`, but they would have to be separate `ECS Services` because the `ECS Service` only holds one `task definition` at a time I think. Each one of these would be configured with a different `networkConfiguration`, and then that would dictate where the Flows run. In order to send the right Flow to the right agent, you would need to specify labels to pair them together.
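To illustrate why an agent with no explicit `networkConfiguration` can land tasks in an unexpected subnet, here is a rough sketch of what inferring one from the default VPC could look like. It is not Prefect's code. The function name, the `FakeEC2` stand-in, and the choice of `assignPublicIp: ENABLED` are assumptions (the last one picked to match the public IP Eddie observed). Only the `describe_vpcs` and `describe_subnets` calls and their filter names come from the boto3 EC2 client.

```python
def infer_network_configuration(ec2):
    """Build a run_task networkConfiguration from the account's default VPC.

    `ec2` is any object exposing the boto3 EC2 client methods
    `describe_vpcs` and `describe_subnets`.
    """
    vpcs = ec2.describe_vpcs(
        Filters=[{"Name": "is-default", "Values": ["true"]}]
    )["Vpcs"]
    if not vpcs:
        raise ValueError("No default VPC found; pass networkConfiguration explicitly")

    vpc_id = vpcs[0]["VpcId"]
    subnets = ec2.describe_subnets(
        Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
    )["Subnets"]
    if not subnets:
        raise ValueError(f"No subnets found in default VPC {vpc_id}")

    return {
        "awsvpcConfiguration": {
            "subnets": [s["SubnetId"] for s in subnets],
            # Assumption: an inferred config asks for a public IP.
            "assignPublicIp": "ENABLED",
        }
    }


class FakeEC2:
    """Stand-in for boto3.client("ec2") so the sketch runs without AWS access."""

    def describe_vpcs(self, Filters):
        return {"Vpcs": [{"VpcId": "vpc-default"}]}

    def describe_subnets(self, Filters):
        return {"Subnets": [{"SubnetId": "subnet-a"}, {"SubnetId": "subnet-b"}]}


inferred = infer_network_configuration(FakeEC2())
```

Nothing in this lookup consults the ECS Service the agent itself runs in, which is the point Kevin makes above: the result depends only on the account's default VPC.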
e
Hi Kevin, I understand. Thank you for looking into this for me and clearly explaining what is happening. My solution to this problem was to define a custom container for the agent and run a Python script that generates a `run_task_kwargs` yaml override file at runtime when the container starts.
import os
import yaml

# Environment variables that must be set on the agent container.
ENV_VAR_NAMES = [
    "SECURITY_GROUP_ID",
    "SUBNET_ID",
]

OVERRIDE_FILE_NAME = "run_task_overrides.yml"
# Raises KeyError at startup if either variable is missing.
env_var_values = {k: os.environ[k] for k in ENV_VAR_NAMES}


# Extra kwargs the ECS agent should pass to every `run_task` call.
run_task_overrides = {
    "networkConfiguration": {
        "awsvpcConfiguration": {
            "assignPublicIp": "DISABLED",
            "securityGroups": [env_var_values["SECURITY_GROUP_ID"]],
            "subnets": [env_var_values["SUBNET_ID"]],
        }
    }
}


with open(OVERRIDE_FILE_NAME, "w") as outfile:
    yaml.safe_dump(run_task_overrides, outfile)

# The shell started by os.system expands the $VARS; Python only fills in the file name.
os.system(
    f"prefect agent ecs start --key $AGENT_API_KEY --execution-role-arn $EXEC_ROLE_ARN  --task-role-arn $TASK_ROLE_ARN --log-level $LOG_LEVEL --label $LABEL --cluster $CLUSTER_ARN --name $AGENT_NAME --run-task-kwargs {OVERRIDE_FILE_NAME}"
)
k
Yes I think that makes perfect sense. If you’re gonna do this, you can use the `ECSAgent` class directly and run the `start` method. Maybe a bit nicer to work with.
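A rough sketch of that suggestion, assuming Prefect 1.x, where `prefect.agent.ecs.ECSAgent` is expected to accept the keyword arguments used below (`name`, `labels`, `cluster`, `task_role_arn`, `execution_role_arn`, `run_task_kwargs_path`). Check these against the installed version. The helper `build_agent_kwargs` and the example values are hypothetical. The Prefect API key and log level are not passed here and are assumed to come from Prefect's own configuration.

```python
import os


def build_agent_kwargs(env, run_task_kwargs_path):
    """Map the container's environment variables onto ECSAgent arguments.

    Hypothetical helper; the variable names mirror the ones used in the
    `prefect agent ecs start` command above.
    """
    return {
        "name": env["AGENT_NAME"],
        "labels": [env["LABEL"]],
        "cluster": env["CLUSTER_ARN"],
        "task_role_arn": env["TASK_ROLE_ARN"],
        "execution_role_arn": env["EXEC_ROLE_ARN"],
        "run_task_kwargs_path": run_task_kwargs_path,
    }


def main():
    # Imported lazily so the helper above can be exercised without Prefect installed.
    from prefect.agent.ecs import ECSAgent

    agent = ECSAgent(**build_agent_kwargs(os.environ, "run_task_overrides.yml"))
    agent.start()  # blocks while the agent polls for flow runs


# Example with made-up values, showing the arguments the agent would receive.
example_kwargs = build_agent_kwargs(
    {
        "AGENT_NAME": "feeds-task-server",
        "LABEL": "feeds-jobs",
        "CLUSTER_ARN": "arn:aws:ecs:region:account:cluster/example",
        "TASK_ROLE_ARN": "arn:aws:iam::account:role/example-task",
        "EXEC_ROLE_ARN": "arn:aws:iam::account:role/example-exec",
    },
    "run_task_overrides.yml",
)
```

In the container entrypoint, `main()` would be called after the override file has been written, replacing the `os.system` call.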
e
Actually I had a question about that: why is there no option in the constructor of the agent for a Prefect API key? Do I need it in my `auth.toml` instead?
k
Good question. If I have to guess, Prefect code is normally written in a way that doesn’t allow secrets to be exposed a lot of the time. I think you can do it through environment variables with `PREFECT__CLOUD__API_KEY` using the `env_vars`. None of the agents seem to accept the key in the constructor.
That seems to be how the CLI does it.
e
Awesome thanks Kevin, I’ll make the change