#prefect-community

Amruth VVKP

12/20/2022, 8:08 PM
Need help deploying Prefect Orion and Prefect Agent to ECS Fargate Clusters - Prefect 2.7.3
===========================================================================
Hi, I am trying to deploy Prefect Orion and Prefect Agent to ECS Fargate clusters. I've had a successful deployment with Docker Compose, but I've been running around in circles trying to deploy it to Fargate.

Here's what I am trying to do:
1. Prefect Orion server - running on Fargate task 1 with a public IP
2. Prefect Agent 1 - running on Fargate task 2, pointing at the deployed Orion server
3. Prefect Agent 2 - running on Fargate task 3, pointing at the deployed Orion server

Further down the line, I plan on upgrading future tasks to run as ECS tasks (still need to play around with that, but it's something for later).

What I have done so far: created an ECR image built from the custom flows and tasks that we have put in, and created the following:
1. ECS Cluster
2. ECS Service and task for Prefect Orion that uses the latest Prefect image from Docker Hub
3. ECS Service and task for Agent-1 that uses a custom image from the ECR repo, built on top of Prefect's latest Docker image
4. ECS Service and task for Agent-2 that uses a custom image from the ECR repo, built on top of Prefect's latest Docker image

I've set up AWS Cloud Map to provide an internal network that lets the Prefect agents look up the Prefect Orion server using internally resolved DNS.

This is running into a few issues:
1. Prefect Orion starts but throws an error - Can't connect to Orion API at <CloudMap resolved DNS>/api or <CloudMap resolved DNS>:4200/api or 0.0.0.0:4200/api. The task definition does contain the Prefect API URL as an environment variable, and the PostgreSQL DB connection is fetched from AWS Secrets Manager. The web UI comes up but can't seem to connect to the DB or show anything under /settings. (Attached screenshot)
2. Prefect agents on both tasks are not starting up and fail with this error - Resourceinitializationerror: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve secret from asm: service call has been retried 5 time(s): failed to fetch secret arnawssecretsmanager:_**_** from secrets manager: RequestCanceled: request context canceled caused by: context deadline exceeded. The same secret passed to the Prefect Orion Fargate task seems to work just fine.

Mason Menges

12/21/2022, 6:51 PM
Hey @Amruth VVKP, admittedly I'm not an expert on ECS, though you might try running through some of the suggestions mentioned here: https://discourse.prefect.io/t/deploy-prefect-in-a-container-on-a-remote-vm/1915/17. There's a lot of overhead that can be difficult to pinpoint through a Slack thread, so some of those suggestions may help point you in the right direction.

Amruth VVKP

12/21/2022, 8:59 PM
I've managed to resolve issue #2; the Prefect agents are starting up just fine. The problem was that assign_public_ip was initially set to False, which caused fetching secrets from AWS Secrets Manager to fail.
For issue #1 - Orion still seems to have an issue on startup, @Mason Menges. I've tried setting the host to 0.0.0.0 and the Orion API URL to a range of values: [http://0.0.0.0:4200/api, http://127.0.0.1:4200/api, http://localhost:4200/api, http://orion:4200/api]. Of these, orion is the discovered service DNS name set up by AWS Service Connect https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-connect.html. So far all my attempts have ended with the same failure when opening the Orion web page. I am using the prefecthq/prefect:2-latest Docker image from Docker Hub. Any other suggestions? I could share my task definition if it helps.
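The fix for issue #2 lives on the ECS service rather than on the task definition. Here is a minimal sketch of where that setting goes, assuming a Fargate service in public subnets. The resource name `prefect_agent` and every `var.*` input are hypothetical and do not come from the thread.

```hcl
# Sketch only: resource and variable names below are hypothetical.
resource "aws_ecs_service" "prefect_agent" {
  name            = "prefect-agent-1"
  cluster         = var.ecs_cluster_arn           # assumed input
  task_definition = var.agent_task_definition_arn # assumed input
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets         = var.public_subnet_ids         # assumed: public subnets
    security_groups = [var.agent_security_group_id] # assumed input

    # Without a route out of the VPC, the task cannot reach Secrets Manager
    # to resolve `secrets` entries before the container starts.
    assign_public_ip = true
  }
}
```

For tasks in private subnets, the usual alternative to a public IP is a NAT gateway or VPC interface endpoints for Secrets Manager (and for ECR and CloudWatch Logs if the task uses them).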

Mason Menges

12/21/2022, 9:12 PM
Hmm, I don't know if there's anything else I could add beyond the suggestions in the article I sent previously. Generally speaking, I'd make sure you have at least these settings configured. There may be more that are necessary, but these are the ones off the top of my head that I'm pretty sure need to be set:
• PREFECT_API_URL
• PREFECT_ORION_API_HOST
• PREFECT_ORION_API_PORT
• PREFECT_ORION_DATABASE_CONNECTION_URL
• PREFECT_ORION_DATABASE_PASSWORD (if necessary)
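As a rough illustration of how the settings listed above might map onto ECS container definitions, here is a sketch using Terraform locals. The local names are hypothetical. The `orion` host name is the Service Connect name mentioned earlier in the thread, and port 4200 is the default used there. The database connection URL is left out because it would normally be injected through the `secrets` block rather than as plain `environment`.

```hcl
# Sketch only: local names are hypothetical, values follow the thread.
locals {
  orion_service_dns = "orion" # Service Connect DNS name from the thread
  prefect_port      = 4200

  # Agents typically only need to know where the Orion API lives.
  agent_environment = [
    {
      name  = "PREFECT_API_URL"
      value = "http://${local.orion_service_dns}:${local.prefect_port}/api"
    }
  ]

  # The server binds to all interfaces inside the container.
  orion_environment = [
    {
      name  = "PREFECT_ORION_API_HOST"
      value = "0.0.0.0"
    },
    {
      name  = "PREFECT_ORION_API_PORT"
      value = tostring(local.prefect_port)
    }
  ]
}
```

Each list can then be passed as the `environment` value of the matching container inside `jsonencode([...])`.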

Amruth VVKP

12/21/2022, 10:02 PM
I am not sure what I am missing, though. This is my Terraform task definition:
# ECS Task Definition for Prefect's Orchestrator task
resource "aws_ecs_task_definition" "prefect_orchestrator_task" {
  family = var.prefect_orchestrator_name
  cpu    = var.agent_cpu
  memory = var.agent_memory

  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"

  // Execution role allows ECS to create tasks and services
  execution_role_arn = data.aws_iam_role.ecs_task_execution_role.arn

  // Task role allows tasks and services to access other AWS resources
  task_role_arn = data.aws_iam_role.prefect_orchestrator_role.arn

  container_definitions = jsonencode([
    {
      name  = "${var.prefect_orchestrator_name}-task-definition"
      image = "${var.orchestrator_image}" # var.orchestrator_image = prefecthq/prefect:2-latest

      command = ["prefect", "orion", "start", "--host", "0.0.0.0", "--log-level", "WARNING"]

      essential = true

      cpu    = "${var.agent_cpu}"
      memory = "${var.agent_memory}"

      portMappings = [
        {
          name          = var.orchestrator_service_port_name
          containerPort = var.prefect_port # var.prefect_port = 4200
          hostPort      = var.prefect_port
          appProtocol   = "http"
        }
      ]

      environment = [
        {
          name  = "PREFECT_API_URL"
          value = "http://0.0.0.0:${var.prefect_port}/api"
        },
        {
          name  = "PREFECT_ORION_API_HOST"
          value = "0.0.0.0"
        },
        {
          name  = "PREFECT_ORION_API_PORT"
          value = tostring("${var.prefect_port}")
        }
      ]
      secrets = [
        {
          name      = "PREFECT_ORION_DATABASE_CONNECTION_URL"
          valueFrom = "${aws_secretsmanager_secret.prefect_postgres_db_connection.arn}"
        }
      ]
      logConfiguration = {
        logDriver = "awslogs"
        options = {
          awslogs-group         = "${aws_cloudwatch_log_group.prefect_orchestrator_log_group.name}"
          awslogs-region        = "${data.aws_region.current.name}"
          awslogs-stream-prefix = "${var.prefect_orchestrator_name}"
        }
      }
    }

  ])
}
The environment variables on the ECS task look correctly set (see screenshot). The Orion startup logs are attached as a screenshot.
I finally figured out what the problem was. The command prefect orion start wouldn't pick up the provided env variables after the cluster was created on AWS Fargate. I saw an old demo by the Prefect team, on a Prefect 2 alpha version, where Kevin Kho spun something up on an AWS EC2 instance. He set the values with prefect config, using the instance's public IP as PREFECT_ORION_UI_API_URL, which did the job. With Fargate, I had to rewire things and pass in the ALB DNS name instead (a public address that is known before the container starts; Fargate usually assigns a public IP to the container, but it can only be read once the container is running). That did the trick and the Orion UI is now up. I guess a dedicated blog post is definitely needed on self-hosting Orion on Fargate alongside spinning up any required agents (for which there are a few samples in the prefect-recipes GitHub repo).
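The fix described above can be sketched in Terraform roughly as follows. This is not the author's actual configuration: `aws_lb.prefect_orion` is a hypothetical load balancer resource, and the URL assumes an HTTP listener on port 80 that forwards to the Orion container on port 4200.

```hcl
# Sketch only: `aws_lb.prefect_orion` is a hypothetical resource, and the
# listener on port 80 forwarding to container port 4200 is an assumption.
locals {
  orion_ui_environment = [
    {
      # URL the browser-side UI uses to call the API. It has to be
      # reachable from the user's browser, so 0.0.0.0 or an internal
      # service name will not work here.
      name  = "PREFECT_ORION_UI_API_URL"
      value = "http://${aws_lb.prefect_orion.dns_name}/api"
    }
  ]
}
```

The list can be merged into the Orion container's existing `environment` entries with `concat(...)`. Because the load balancer's DNS name is known at plan/apply time, it avoids depending on the task's public IP, which only exists once the task is running.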