Thread
#prefect-community
    Bruno Murino

    9 months ago
    Hi everyone, I’m struggling to understand how Prefect behaves when I redeploy. The way it works now is that 1 git repo = 1 Prefect Project containing multiple flows. We have a Dockerfile whose entrypoint simply registers all the flows and then starts a LocalAgent. This is deployed as an ECS Service. When we deploy, we simply kill the existing service and start a new one with a new Docker image. This means there is a little “downtime” in terms of flow schedules, for example:
    • Time is 4.59
    • We start a new deployment, meaning that at 4.59 the service is killed
    • There is a job scheduled to run at 5.00
    • The service is only back up at 5.01
    • This means the 5.00 run will be fully missed (please correct me if I’m wrong)
    My question, however, is about a second scenario, where:
    • A run starts at 4.58 and takes about 5 minutes to finish
    • We start a new deployment at 4.59, meaning there is a run in progress
    • What happens to the flow run if the entire service (local agent + code execution environment) is killed?
    • At 5.01, when the new deployment finishes, will Prefect know to resume that flow run? How would it work?
    The reason I’m asking is that I plan on changing our deployment strategy to blue/green, but I’m not sure how Prefect will cope with a flow run being killed midway.
    Sorry if this is confusing! I appreciate any help.
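The first scenario boils down to checking which scheduled start times fall inside the window when no agent is up. A minimal sketch with made-up times matching the example above; the function name and schedule are hypothetical, and it says nothing about how Prefect itself treats late runs:

```python
from datetime import datetime

def runs_in_window(scheduled, down_from, up_at):
    """Return scheduled start times that fall while no agent is running."""
    return [t for t in scheduled if down_from <= t < up_at]

day = datetime(2022, 1, 1)  # arbitrary date, only the clock times matter
scheduled = [
    day.replace(hour=4, minute=0),
    day.replace(hour=5, minute=0),
    day.replace(hour=6, minute=0),
]
down_from = day.replace(hour=4, minute=59)  # old service killed
up_at = day.replace(hour=5, minute=1)       # new service ready

affected = runs_in_window(scheduled, down_from, up_at)
print([t.strftime("%H:%M") for t in affected])
```

Only the 5.00 start time lands inside the 4.59 to 5.01 gap, so that is the run whose fate depends on how the backend and the new agent handle late runs.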
    Anna Geller

    9 months ago
    @Bruno Murino you don't have to kill existing flow runs or agents when you deploy a new version of a flow. When your flow changes, you register a new version of it so that the next time you run it, you trigger a new version of the flow. This way you can entirely avoid any downtime related to new flow version deployment.
    Bruno Murino

    9 months ago
    I’m struggling with understanding this because I’m using a LocalRun with a LocalAgent and LocalStorage — so the only way to change the flow code is by deploying a new docker image, and because I’m using a LocalAgent, this means I need the agent to be in that docker image as well
    Anna Geller

    9 months ago
    @Bruno Murino actually, with Local agent and storage, you don’t need to use Docker at all. You could deploy your agent e.g. in a virtual environment for isolation, but normally local agent is deployed as a local process without Docker. But there is also a Docker agent - perhaps this is what you’re looking for?
    when it comes to flow deployment patterns, some users shared how they do it in this Github discussion - sharing in case this might be interesting for you https://github.com/PrefectHQ/prefect/discussions/4042
    Bruno Murino

    9 months ago
    we need docker because we deploy as an ECS Service
    Anna Geller

    9 months ago
    Do you want your flows to run on ECS too? We have an ECS agent: https://docs.prefect.io/orchestration/agents/ecs.html I could share a tutorial on how to set it up if you’re interested
    Bruno Murino

    9 months ago
    I have tried that but it was too slow to start the flow run
    maybe I should give it a second try though
    Anna Geller

    9 months ago
    I see, I can actually totally understand that 🙂 the ECS with Fargate is really not the fastest option to start because the Serverless data plane needs to first provision the compute resources and pull the image before it can run the flow. But you could solve it by using EC2 instead of Fargate capacity provider.
    Bruno Murino

    9 months ago
    we do use EC2 😂
    Anna Geller

    9 months ago
    Really? 😄 that’s weird. With EC2 data plane it should start up pretty much instantaneously. Could it be that the capacity provider was still set to Fargate?
    Bruno Murino

    9 months ago
    don’t think so — we don’t have anything on fargate and I had to deal with our VPC and stuff, so it was definitely EC2
    Anna Geller

    9 months ago
    Anyway, in case you want to go down that path, you could use the ECS CLI to spin up the cluster and then follow this or this to set up an agent as an ECS service, changing the capacity provider from FARGATE to EC2.
    Bruno Murino

    9 months ago
    there might have been a problem with the ECS agent we had. It was deployed as an ECS Service and would get an OOM error every 3 days or so, for no apparent reason
    Anna Geller

    9 months ago
    hmm it never happened to me. Perhaps you could allocate more memory to the agent then?
    Bruno Murino

    9 months ago
    yea I think I’ll give that a try
    didn’t seem to be the issue, the memory profile was a steady increase at all times, not related to any process
    Anna Geller

    9 months ago
    usually the agent doesn’t need a lot of memory because it doesn’t do any work by itself, it only spins up flows as ECS tasks
    Bruno Murino

    9 months ago
    exactly, so I thought it was a bug at the time or something
    I think it’s worth trying that approach again, with newer versions and etc
    as it does look like deploying as ecs task might solve the downtime issue
    @Anna Geller I’m setting up some flows with ECS Run and something seems a bit odd — the task definition (visible in AWS console) contains the api key to prefect cloud — is there any way to avoid that?
    Anna Geller

    9 months ago
    yes, there is! There are two ways to store credentials that are retrieved by ECS tasks:
    1. AWS Parameter Store
    2. AWS Secrets Manager
    The blog post I linked handles that and includes a section on how to set it up using #1. You can also check out #3 in this blog: https://aws.plainenglish.io/8-common-mistakes-when-using-aws-ecs-to-manage-containers-3943402e8e59?sk=334f367ff27d3fe9b56ff31f8b9ba447
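For reference, a rough sketch of what either option looks like inside an ECS container definition, written as the Python dict you would register. The ARN, container name, and image are placeholders, and `PREFECT__CLOUD__API_KEY` is assumed to be the variable in question. The `secrets` / `valueFrom` keys are part of the ECS task definition format; the task execution role also needs permission to read the referenced parameter or secret.

```python
# Hypothetical ARN; replace with the ARN of your own parameter or secret.
API_KEY_ARN = "arn:aws:ssm:us-east-1:123456789012:parameter/prefect/api_key"

container_definition = {
    "name": "example",                 # placeholder container name
    "image": "example/image:latest",   # placeholder image
    # ECS resolves these at container start; only the ARN is stored
    # in the task definition, never the secret value itself.
    "secrets": [
        {"name": "PREFECT__CLOUD__API_KEY", "valueFrom": API_KEY_ARN},
    ],
    # Plaintext variables go here and are visible in the console.
    "environment": [],
}

secret_names = [s["name"] for s in container_definition["secrets"]]
plaintext_names = [e["name"] for e in container_definition["environment"]]
print(secret_names, plaintext_names)
```

A Secrets Manager reference has the same shape, with a `secretsmanager` ARN in `valueFrom`.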
    Bruno Murino

    9 months ago
    I do use AWS secrets manager on ECS task definition that I create — but now Prefect is creating them and adding a bunch of environment variables
    like this
    Anna Geller

    9 months ago
    no no, Prefect will use the value from parameter or Secret - have a look at this section:
    Bruno Murino

    9 months ago
    this is what I have setup:
    Anna Geller

    9 months ago
    but then you didn’t follow this tutorial, right? 😄 if you had, it would look like this: it must be an ARN to the Secret or Parameter
    Bruno Murino

    9 months ago
    apologies — what tutorial are you referring to? the link you sent doesn’t mention Prefect at any point
    Bruno Murino

    9 months ago
    on the link you sent, the task definition is created for the ECS Agent, which is fine, but my issue is with the task definition for flows
    btw I really do appreciate your help!
    also, to clarify: the task definition for the flows is also fine. It’s when the flow runs (the ECS task runs) that you can go to the ECS task run in the AWS console and see a bunch of env vars that were not set up by my code or by me, but were set by the Prefect ECS Agent, I believe
    this is the task definition created when I submit a run of a flow registered with a run_config of ECSRun
    and this is the task details, for a task that used the task-definition above
    Anna Geller

    9 months ago
    nice! so it looks like you got it working? LMK if you have any open questions.
    Bruno Murino

    9 months ago
    well not really! haha sorry for the confusion
    let me rephrase
    when the ECS Agent instantiates an ECS Task, I can check out the ECS Task run in the AWS console and there are a bunch of env vars showing, which include the env var with the api key
    which is demonstrated by the last screenshot I sent
    to clarify, the task does run fine — my question is just about security concerns and etc
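One way to make this concern checkable is to scan the plaintext environment of a task (for example the `environment` list of a container definition or of a container override) for names that look like credentials. A minimal sketch; the helper name, the marker list, and the sample values are all hypothetical:

```python
SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")

def plaintext_sensitive_vars(environment):
    """Return names in an ECS-style `environment` list that look like credentials."""
    return [
        item["name"]
        for item in environment
        if any(marker in item["name"].upper() for marker in SENSITIVE_MARKERS)
    ]

# Sample data shaped like ECS environment entries (values are dummies).
overrides_env = [
    {"name": "PREFECT__CLOUD__API_KEY", "value": "dummy-value"},
    {"name": "PREFECT__LOGGING__LEVEL", "value": "INFO"},
]

flagged = plaintext_sensitive_vars(overrides_env)
print(flagged)
```

Anything flagged this way is readable by whoever can view the task in the console, which is the exposure described above.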
    Anna Geller

    9 months ago
    @Bruno Murino I see. The only way this may happen is if your ECSRun is using a different task definition than your agent. If you don’t set it explicitly, the one from the agent will be used. I’ve recently updated the docstring in the ECSRun to include more examples - perhaps you can reference the same task definition ARN as the one used by your agent?
    Bruno Murino

    9 months ago
    I’m afraid the task run of the flow has to be different from the task definition of the agent — mainly because they run different images
    let me try something
    no luck. I was hoping to avoid passing a full task definition and just pass the details through other arguments, but the lack of a “networkMode” argument means I have to write a custom task definition from scratch
    also, I’m not sure that would have solved it. I suspect Prefect uses “container overrides” to inject all the env vars it requires, and that bubbles up to the AWS console
    because the actual task definition is as I coded when registering the flow
    I’m wondering if I can pass some “container overrides” via the run_task_kwargs to set the api key variables to be fetched from aws secrets/parameter store
    ah ok haha
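On the idea of passing container overrides through run_task_kwargs: as far as I know, a `containerOverrides` entry in the ECS RunTask API accepts only the keys listed below, and `secrets` is not among them (it exists only on the container definition). Treat the key list as an assumption to verify against the current AWS reference. The helper name and ARN are hypothetical:

```python
# Keys believed to be accepted in a RunTask containerOverrides entry
# (assumption: check against the current AWS API reference).
ALLOWED_OVERRIDE_KEYS = {
    "name",
    "command",
    "environment",
    "environmentFiles",
    "cpu",
    "memory",
    "memoryReservation",
    "resourceRequirements",
}

def unsupported_override_keys(override):
    """Return the keys of a container override that RunTask would not accept."""
    return set(override) - ALLOWED_OVERRIDE_KEYS

override = {
    "name": "flow",
    "secrets": [
        {
            "name": "PREFECT__CLOUD__API_KEY",
            # Hypothetical ARN for illustration only.
            "valueFrom": "arn:aws:ssm:us-east-1:123456789012:parameter/prefect/api_key",
        }
    ],
}

rejected = unsupported_override_keys(override)
print(rejected)
```

If that holds, a secret reference has to live in the task definition itself rather than in the per-run overrides.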
    Anna Geller

    9 months ago
    network mode belongs to the task definition, but the exact network details such as VPC id and subnet id belong to run task
    Bruno Murino

    9 months ago
    well I need “networkMode = bridge” anyway haha
    Anna Geller

    9 months ago
    true, for those you don’t need VPC details, correct
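The split described here can be sketched as two separate dicts: the network mode sits in the task definition, while subnets and security groups sit in the run-task arguments and are only relevant for `awsvpc` mode. The IDs, container name, image, and helper function are hypothetical; the key names follow the ECS API:

```python
# Task definition: how the containers are networked.
task_definition = {
    "networkMode": "bridge",
    "containerDefinitions": [
        {"name": "flow", "image": "example/image:latest"},  # placeholders
    ],
}

# Run-task arguments: where an awsvpc task is placed.
awsvpc_run_task_kwargs = {
    "networkConfiguration": {
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],     # hypothetical ID
            "securityGroups": ["sg-0123456789abcdef0"],  # hypothetical ID
        }
    }
}

def run_task_kwargs_for(task_def):
    """Only awsvpc tasks need VPC details at run time."""
    if task_def.get("networkMode") == "awsvpc":
        return awsvpc_run_task_kwargs
    return {}

print(run_task_kwargs_for(task_definition))
```

With `bridge` mode the run-task arguments stay empty, which matches the point that no VPC details are needed in that case.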
    Bruno Murino

    9 months ago
    do you mind commenting on my assumption “I suspect what Prefect does is a “container overrides” to inject all envs vars it requires — and that bubbles up to the AWS console”
    Anna Geller

    9 months ago
    correct, if you set something explicitly on ECSRun, it will serve as overrides to what was configured on the agent. You can see more here: https://github.com/PrefectHQ/prefect/blob/d44b72a950ebda9f7bc6a9712fc71e2e9c680d25/src/prefect/agent/ecs/agent.py#L444-L475
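The override behaviour can be pictured as a layered merge in which later layers win. This is a simplified sketch and not the actual Prefect code: the function name is hypothetical, and the third layer (values the agent always sets itself, such as the API key) is an assumption about the linked source.

```python
def merge_env(agent_env, run_config_env, required_env):
    """Later layers win: run config over agent, required values over both."""
    merged = dict(agent_env)
    merged.update(run_config_env)
    merged.update(required_env)
    # ECS expects environment entries as a list of name/value pairs.
    return [{"name": k, "value": v} for k, v in merged.items()]

env = merge_env(
    agent_env={"LOG_LEVEL": "INFO", "TEAM": "data"},
    run_config_env={"LOG_LEVEL": "DEBUG"},
    required_env={"PREFECT__CLOUD__API_KEY": "dummy-value"},
)
as_dict = {item["name"]: item["value"] for item in env}
print(as_dict)
```

Under this model a value set on the run config replaces the agent's value, while anything in the last layer still ends up as a plaintext entry in the container overrides.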
    Bruno Murino

    9 months ago
    Nice! Thanks for showing me! From what I gather, if I set a container overrides on some env vars it will get ignored, however since the overrides I want to place are “secrets” instead of “environment”, it should still be applied. The bit I need to test is if some env var is in both the “environment” section and the “secrets” section, then which way takes precedence
    Do you think it's worth trying to contribute to that part of Prefect? The goal would be to accept an AWS secret/parameter store ARN as part of the ECS run config, so that the api key is not visible anywhere in the AWS console