# ask-community
j
It looks like the terraform is using an older 4.x AWS provider. Any plans to upgrade to 5.x? https://github.com/PrefectHQ/prefect-recipes/blob/main/devops/infrastructure-as-code/aws/tf-prefect2-ecs-agent/main.tf
d
Hey Jason, we can definitely look into this 👍 Would you mind opening an issue on the repo?
j
Are you guys actively maintaining these repos?
d
As far as I am aware, we are.
But looking closer at this specific example, you may be interested in our terraform provider: https://github.com/PrefectHQ/terraform-provider-prefect
j
Is that what we should use to spin up resources for prefect?
cc @Zach Munro
d
We have a few different options depending on your preference (we also have a Helm chart, for example)
But for Terraform, yes, the provider is our recommended way
j
Ok, do you have an example of how we'd use the provider to create a work pool?
e
hey @jason - each of these repos serves a different use case, depending on what you need:
• you can use the prefect-recipes repository (which you originally linked to) as example implementations for running our older agents in certain non-k8s container environments, like ECS, using Terraform as the configuration
• if you're looking to spin up workers in k8s, you could use our prefect-helm charts, which offer charts for workers, agents, and a prefect-server (workers/agents only run your flows, but require a Prefect API to connect to)
• if you're looking to create Prefect Cloud objects, like workspaces and work pools, use the terraform provider - note that this project is still in active development and we're still adding Prefect object support
  ◦ for ex., work pools can be configured with the prefect_work_pool terraform resource, sketched just below
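A minimal sketch of that provider usage, assuming a Prefect Cloud API key plus account and workspace IDs are on hand; the provider source and argument names shown here are assumptions to double-check against the provider docs for the release you install:

terraform {
  required_providers {
    prefect = {
      source = "PrefectHQ/prefect"
    }
  }
}

// Assumed provider configuration: Prefect Cloud credentials passed in as variables.
provider "prefect" {
  api_key      = var.prefect_api_key
  account_id   = var.prefect_account_id
  workspace_id = var.prefect_workspace_id
}

// Hypothetical ECS-typed work pool that a Fargate worker will poll for flow runs.
resource "prefect_work_pool" "ecs" {
  name   = "ecs-work-pool"
  type   = "ecs"
  paused = false
}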
j
We want to use fargate (no k8s) running a worker pool in our aws account
e
got it. i would definitely take a look at our worker Fargate recipe, which should give you a good example of the necessary TF / AWS resources (e.g. IAM, execution policy, ECS cluster/service). you may need to tweak it a bit for your specific requirements, but our recipes are created from working examples that we've set up with our users/customers: https://github.com/PrefectHQ/prefect-recipes/tree/main/devops/infrastructure-as-code/aws/tf-prefect2-ecs-worker
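For orientation, the recipe boils down to a handful of AWS resources along these lines (a trimmed, illustrative sketch rather than the recipe verbatim; names are placeholders):

// ECS cluster that hosts the long-running Prefect worker service.
resource "aws_ecs_cluster" "prefect_worker_cluster" {
  name = "prefect-worker"
}

// Trust policy so ECS tasks can assume the execution role.
data "aws_iam_policy_document" "ecs_assume_role" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

// Execution role lets Fargate pull the container image and write logs.
resource "aws_iam_role" "prefect_worker_execution_role" {
  name               = "prefect-worker-execution-role"
  assume_role_policy = data.aws_iam_policy_document.ecs_assume_role.json
}

resource "aws_iam_role_policy_attachment" "execution_role_policy" {
  role       = aws_iam_role.prefect_worker_execution_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

The worker itself is then an aws_ecs_task_definition plus an aws_ecs_service running on that cluster.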
j
Ok, so just use the terraform template. We were on that path already will give it a shot. cc @Zach Munro
👍 1
Made some progress but now running into this when the ECS task tries to run a deployment: https://github.com/PrefectHQ/prefect/issues/11637 @Edward Park
e
hmm. do you have the full output?
j
Failed to submit flow run 'a8411c17-0f0e-40c4-b0e8-062dc7b02b1c' to infrastructure.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/workers/base.py", line 904, in _submit_run_and_capture_errors
    result = await self.run(
  File "/usr/local/lib/python3.10/site-packages/prefect_aws/workers/ecs_worker.py", line 639, in run
    ) = await run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 136, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/prefect_aws/workers/ecs_worker.py", line 755, in _create_task_and_wait_for_start
    self._wait_for_task_start(
  File "/usr/local/lib/python3.10/site-packages/prefect_aws/workers/ecs_worker.py", line 1033, in _wait_for_task_start
    raise type(code, (RuntimeError,), {})(reason)
prefect_aws.workers.ecs_worker.TaskFailedToStart: CannotPullContainerError: pull image manifest has been retried 5 time(s): failed to resolve ref docker.io/prefecthq/prefect:2.18.3-python3.10: failed to do request: Head "https://registry-1.docker.io/v2/prefecthq/prefect/manifests/2.18.3-python3.10": dial tcp 54.227.20.253:443: i/o timeout
02:59:44 PM | prefect.flow_runs.worker | INFO | Completed submission of flow run 'a8411c17-0f0e-40c4-b0e8-062dc7b02b1c'
we're getting a timeout actually
It looks like the task is being spun up with a different network configuration than the other worker 🤔
e
hmm. this is trying to talk to the public dockerhub registry, so i'm suspecting a networking issue
yeah. do you know if your task is being spun up in the correct subnet?
j
Doesn't look like it
We're going to try putting them in the public subnet to see if that helps. Curiously, the other worker was in the public subnet already
👍 1
Going to pick this up tomorrow. Changing the subnet didn't help
Do we need to manually override the network configuration in the work pool config?
e
are you using the aws_ecs_service TF resource? if so, you may need to set assign_public_ip to true: https://github.com/PrefectHQ/prefect-recipes/blob/main/devops/infrastructure-as-code/aws/tf-prefect2-ecs-worker/ecs.tf#L69
assign_public_ip = true
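For context, that flag sits in the service's network_configuration block; a Fargate task needs either a public IP or a NAT-routed private subnet to reach Docker Hub and pull the image. Roughly, mirroring the recipe line linked above (values are illustrative):

resource "aws_ecs_service" "prefect_worker_service" {
  // ... other arguments as in the recipe ...
  network_configuration {
    subnets          = var.worker_subnets
    security_groups  = [aws_security_group.prefect_worker.id]
    // Without this (or a NAT gateway on a private subnet), image pulls from
    // docker.io time out, as in the traceback above.
    assign_public_ip = true
  }
}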
j
Will have a look tomorrow. The listener worker / poller spins up fine and has connectivity; the tasks it creates do not.
e
ok, keep us posted
j
We're using prefect_ecs_worker. It appears to have the public IP assignment... Let me make sure our service is actually using the latest task definition.
This is our aws_ecs_service config, which appears to be correct:
resource "aws_ecs_service" "prefect_worker_service" {
name = "prefect-worker-${var.name}"
cluster = aws_ecs_cluster.prefect_worker_cluster.id
desired_count = var.worker_desired_count
launch_type = "FARGATE"
// Public IP required for pulling secrets and images
// https://aws.amazon.com/premiumsupport/knowledge-center/ecs-unable-to-pull-secrets/
network_configuration {
security_groups = [aws_security_group.prefect_worker.id]
assign_public_ip = true
subnets = var.worker_subnets
}
task_definition = aws_ecs_task_definition.prefect_worker_task_definition.arn
}
It appears the task that gets deployed for the flow run uses a different task definition (prefect_default_<guid>) than the worker's. I can't find any reference to this in our terraform - what creates the new task definition?
e
could you share your ECS work pool’s configuration?
the work pool config defines the flow/task run’s ECS definition (separate from the worker’s task definition)
you can think of the work pool config == the job’s task definition
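One way to pin the flow run's networking is to override defaults in the ECS work pool's base job template; building on the work pool sketch earlier, roughly as below. The base_job_template argument and variable names like vpc_id and cluster are assumptions here - the worker's actual default template can be dumped with prefect work-pool get-default-base-job-template --type ecs on a recent Prefect version and edited from there.

// Hypothetical override so flow-run tasks land in the intended VPC and cluster.
resource "prefect_work_pool" "ecs" {
  name = "ecs-work-pool"
  type = "ecs"

  base_job_template = jsonencode({
    // A real template also carries the job_configuration section from the
    // worker's default template; trimmed here for brevity.
    variables = {
      properties = {
        vpc_id = {
          type    = "string"
          default = var.vpc_id
        }
        cluster = {
          type    = "string"
          default = aws_ecs_cluster.prefect_worker_cluster.arn
        }
      }
    }
  })
}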
j
We got it going, but I'll say the indirection between the worker's task definition and the work pool config is a little confusing. Took a minute to understand what was going on there.