Back again with more noob questions I ve hopefully successfu Prefect Community #ask-community

Darragh

06/04/2020, 2:31 PM

Back again with more noob questions! I’ve [hopefully] successfully configured my Fargate Agent, to the point where the flow gets picked up and I get the following log output

[2020-06-04 11:01:26,972] INFO - agent | Deploying flow run 05ec3609-4457-4085-8cf3-d01a6d13ccb1

I can see that it’s created the TaskDefinition, but after that it just sits there and never runs the Task, so I’ve no idea what’s going on, and the UI and logs never show any update. Am I missing something? Some magic words to make the agent actually run the task as opposed to just registering? Being able to turn on debug logging would also be a massive bonus 🙂 All I’m getting from the agent is INFO, and I’ve added the

[logging] level="DEBUG"

to the config.toml, and added

export PREFECT__LOGGING__LEVEL="DEBUG"

but nothing’s happening. UPDATE In a fit of desperation I went on to my EC2 box and edited the fargate agent code to make all DEBUG statements INFO, and lo and behold, more log output. BUT!!! As far as the

deploy_flow

function is concerned everything succeeded. And I can see the task definition has been created, but nothing else ever happens in logs or the UI. So I’m slightly wiser but not by a lot. 🙂 Even digging into

deploy_and_update_flow_run

in the root agent.py file isn’t giving me much, it seems like everything is succeeding, so I’m at a loss as to why nothing is coming in the UI to show failed, passed, stalled… UPDATE 2 Ok, found something - I took the definition that prefect created and ran it manually, and it gets a

CannotStartContainerError

failed to initialise logging. AccessDenied to create CloudWatch log group. Fixing. I’ll be back 😂 Does the fargate task need access to call back to the agent/graphql for updates? Finally found some logs and it seems like the flow could be trying to make a callback to localhost:4200, is that right? I realised this was a stupid question as I wrote it. UPDATE 3 Turns out it does. From the Fargate Agent docs: Core server

In order to use this agent with Prefect Core's server the server's GraphQL API endpoint must be accessible.

But nothing to say how to configure it? I’ve overridden the cloud.api endpoint in the config.toml and I can see the update coming out in the logs, but the task is still trying to call back to localhost, presumably because the api endpoint override needs to be injected into the task? As in, into the docker container running the flow?

👀 1

Dylan

06/04/2020, 3:35 PM

@Darragh Apologies, Chris tells me I may have unintentionally misled you

Dylan

06/04/2020, 3:35 PM

Is your agent running on the same machine as your Prefect Server?

Darragh

06/04/2020, 3:36 PM

The agent is, yes, and has access to the .prefect/config.toml file - but the execution of the flow happens on Fargate rather than on the machine the agent is running on.

Darragh

06/04/2020, 3:36 PM

So presumably the docker container DOES need to be aware of the “remote” API?

Dylan

06/04/2020, 3:36 PM

Yes, but you’re going to need to set it a little differently

Dylan

06/04/2020, 3:37 PM

One moment

Darragh

06/04/2020, 3:37 PM

Can I inject it into the Docker storage using env_vars?

Dylan

06/04/2020, 3:43 PM

The fargate agent actually passes the correct URI to your flows when it creates the fargate tasks. Since your agent is being started locally, it can talk to Prefect Server but then your flows think they can reach the server at

localhost

. You need to start the agent with the API configured so that it can pass the URI to the fargate tasks

Darragh

06/04/2020, 3:45 PM

So when you say configured correctly, I start the fargate agent [python fargate_agent.py] in a context where the following variable exists:

PREFECT__SERVER__UI__GRAPHQL__URL=http://1.1.1.1:4200/graphql

Darragh

06/04/2020, 3:45 PM

Is that right?

Dylan

06/04/2020, 3:45 PM

prefect agent start fargate

with the api set should work

Darragh

06/04/2020, 3:45 PM

Api set in what way?

Darragh

06/04/2020, 3:46 PM

PREFECT__CLOUD__API?

Dylan

06/04/2020, 3:46 PM

Yes, that’s the Env var 👍

Darragh

06/04/2020, 3:46 PM

Great, that’s a very quick test…

Dylan

06/04/2020, 3:46 PM

I was looking to see if

prefect agent start

has an

--api

flag

Dylan

06/04/2020, 3:48 PM

PREFECT__CLOUD__API

should do the trick

Darragh

06/04/2020, 3:54 PM

And is that the full path to graphql or the host and port?

Dylan

06/04/2020, 3:54 PM

The full path

Dylan

06/04/2020, 3:54 PM

https://

etc

Darragh

06/04/2020, 3:55 PM

Hmm. Still same problem. Can’t connect to localhost

Darragh

06/04/2020, 4:04 PM

What would the corresponding entry in the config.toml be?

Darragh

06/04/2020, 4:06 PM

[cloud]

api = "http://1.1.1.1:4200/graphql"

???

Dylan

06/04/2020, 4:07 PM

Yes

Darragh

06/04/2020, 4:08 PM

Thanks, going to test that now. Out of curiosity, which takes precedence, env var or config.toml?

Darragh

06/04/2020, 4:08 PM

Presumably the env?

Dylan

06/04/2020, 4:09 PM

Yes, the ENV takes precedence
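To make the precedence concrete, here is a minimal sketch of the `PREFECT__SECTION__KEY` convention used in this thread: double underscores separate the config section from the key, and a value from the environment overrides the one from config.toml. The helper name `apply_env_overrides` and the plain-dict config are illustrative assumptions, not Prefect's actual config loader.

```python
def apply_env_overrides(config, environ, prefix="PREFECT__"):
    """Return a copy of `config` with PREFECT__SECTION__KEY overrides applied.

    Hypothetical helper for illustration only; Prefect's real loader may differ.
    """
    # Copy each section so the caller's config is not mutated.
    merged = {section: dict(values) for section, values in config.items()}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        # "PREFECT__CLOUD__API" -> ["cloud", "api"]
        path = name[len(prefix):].lower().split("__")
        if len(path) != 2:
            continue  # this sketch only handles simple [section] key pairs
        section, key = path
        merged.setdefault(section, {})[key] = value
    return merged


# Sample values standing in for ~/.prefect/config.toml
file_config = {
    "cloud": {"api": "http://localhost:4200/graphql"},
    "logging": {"level": "INFO"},
}
# Sample environment; 1.1.1.1 is the placeholder address used in this thread.
env = {
    "PREFECT__CLOUD__API": "http://1.1.1.1:4200/graphql",
    "PREFECT__LOGGING__LEVEL": "DEBUG",
    "HOME": "/home/user",
}
effective = apply_env_overrides(file_config, env)
print(effective["cloud"]["api"])
print(effective["logging"]["level"])
```

Variables without the `PREFECT__` prefix are ignored, and the file values are only used when no matching environment variable is set.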

Darragh

06/04/2020, 4:10 PM

Ok, so one of my other questions from way up 😄 if I set

PREFECT__LOGGING__LEVEL=DEBUG

then that should get picked up too?

Dylan

06/04/2020, 4:11 PM

Yes, definitely 👍

Darragh

06/04/2020, 4:16 PM

Right, I have a bigger problem so. Neither of those env vars actually get picked up and used

Darragh

06/04/2020, 4:16 PM

I have the LOGGING one set in the env that the agent runs in, but it never logs debug statements

Dylan

06/04/2020, 4:17 PM

🧐

Darragh

06/04/2020, 4:17 PM

And the

PREFECT__CLOUD__API

is [I think] picked up by the agent, but NOT the flow it runs on fargate

Darragh

06/04/2020, 4:18 PM

But something strikes me - I’m not calling the

prefect agent start fargate

command directly from CLI, I have it wrapped in a python script so I can pass the necessary Fargate params to it. Would that cause an issue with passing env vars through?

Dylan

06/04/2020, 4:19 PM

Maybe

Dylan

06/04/2020, 4:20 PM

In your script, could you try setting the environment variables as part of the call to the CLI?

PREFECT__LOGGING__LEVEL=DEBUG prefect agent start fargate

etc?

Darragh

06/04/2020, 4:20 PM

Think that might be the culprit, looking at the docs: `You should continue to use the

env_vars

kwarg to pass agent level environment variables to your tasks.`

Darragh

06/04/2020, 4:21 PM

https://docs.prefect.io/orchestration/agents/fargate.html

Darragh

06/04/2020, 4:22 PM

Sorry, when I say I’m using a python wrapper I mean a full on python script, not just a shell file calling the CLI. It creates a FargateAgent, configures it, and then calls agent.start(), so reading the docs I think I need to pass the env_vars down

Darragh

06/04/2020, 4:23 PM

Buggered if I can see exactly how though 😄

Dylan

06/04/2020, 4:23 PM

I think that’s correct, I believe there’s an argument for the env_vars

Dylan

06/04/2020, 4:24 PM

https://docs.prefect.io/api/latest/agent/fargate.html#fargateagent

Dylan

06/04/2020, 4:24 PM

That’s correct 👍

Darragh

06/04/2020, 4:25 PM

Bewdiful. Now I need to figure out can I do it nice and simply by going env_Vars=env_Vars or do I have to list the damn things manually 😄

😂 1

Darragh

06/04/2020, 4:41 PM

Any idea what the syntax for passing the env_vars is ? Getting errors or silence when I’m passing them:

env_vars={PREFECT__LOGGING_LEVEL:DEBUG}

Darragh

06/04/2020, 5:58 PM

@Dylan I’m passing through

env_vars={'PREFECT___LOGGING___LEVEL':'DEBUG'}

into the FargateAgent() constructor but it doesn’t seem to be picking it up.

Dylan

06/04/2020, 5:59 PM

@Darragh that sounds like a bug. Can you open an issue with as much info as you have? We’ll triage and address as soon as we can

Dylan

06/04/2020, 6:00 PM

But maybe also try

env_vars={'PREFECT__LOGGING__LEVEL':'debug'}

just on a hunch?
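On the syntax question: `env_vars` is documented as a plain Python dict, so both the names and the values need to be quoted strings (an unquoted `{PREFECT__LOGGING_LEVEL:DEBUG}` raises a NameError unless those names happen to exist as variables). A minimal sketch, which also covers the "do I have to list them manually" question by collecting every `PREFECT__`-prefixed variable from the agent's own environment. `collect_prefect_env` is a hypothetical helper, and the commented-out `FargateAgent` call assumes the `env_vars` constructor argument described in the linked API docs.

```python
import os


def collect_prefect_env(environ):
    """Pick out PREFECT__* settings so they can be forwarded to flow runs."""
    return {
        name: value
        for name, value in environ.items()
        if name.startswith("PREFECT__")
    }


# Whatever PREFECT__* variables are exported in the shell that runs this script.
env_vars = collect_prefect_env(os.environ)
print(env_vars)

# Assumed usage with the Prefect version discussed here (not executed in this sketch):
# from prefect.agent.fargate import FargateAgent
# agent = FargateAgent(env_vars=env_vars)
# agent.start()
```

Note that the variable names use exactly two underscores between parts; a name with one or three underscores is a different variable.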

Darragh

06/04/2020, 6:02 PM

Yeah no problem. I don't mind so much about the logging, but the same issue happens when I'm trying to pass the Cloud_Api through as well, so I can't actually get a fargate task to run..

Dylan

06/04/2020, 6:02 PM

I understand, that’s very frustrating

Dylan

06/04/2020, 6:04 PM

I’m not personally very familiar with our fargate tools (we’re on GCP)

Dylan

06/04/2020, 6:05 PM

But I will say I think you’re the first user I’ve encountered who tried to run a fargate setup with Prefect Server (as opposed to Prefect Cloud) so it seems like there are a couple of rough edges we could make smoother

Dylan

06/04/2020, 6:06 PM

If you outline as many of your pain points as possible in the issue we’ll do our best to address them all

Dylan

06/04/2020, 6:06 PM

Thanks for helping us make Prefect better!

Darragh

06/04/2020, 6:07 PM

Will do, thanks. In the meantime I think I'll have to embed all the options in a shell script and do it that way, hopefully that'll do as a workaround

👍 1

Joe Schmid

06/04/2020, 6:58 PM

@Darragh For config like you're talking about, we set that in environment variables that get created for the ECS task that the Fargate Agent creates. A portion of our config for our Fargate Agent looks like:

containerDefinitions=[
        {
            "command": [],
            "environment": [
                {"name": "PREFECT__LOGGING__LOG_TO_CLOUD", "value": "true"},
                {"name": "AWS_DEFAULT_REGION", "value": REGION_NAME},

I think having the Fargate Agent make

PREFECT__CLOUD__API

available as an environment variable in the ECS task run for each Flow run seems very appropriate in your case.
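The `environment` list in the fragment above follows the ECS container-definition shape: a list of `{"name": ..., "value": ...}` pairs rather than a flat dict. A small sketch of building that list from a dict; `to_ecs_environment` is a hypothetical helper, and the API address is the placeholder used earlier in this thread, not a real endpoint.

```python
def to_ecs_environment(env_vars):
    """Convert {"NAME": "value"} into ECS's [{"name": ..., "value": ...}] form."""
    return [
        {"name": name, "value": str(value)}  # ECS expects string values
        for name, value in sorted(env_vars.items())
    ]


container_definition = {
    "command": [],
    "environment": to_ecs_environment(
        {
            "PREFECT__CLOUD__API": "http://1.1.1.1:4200/graphql",
            "PREFECT__LOGGING__LOG_TO_CLOUD": "true",
        }
    ),
}
print(container_definition["environment"])
```

Sorting by name only keeps the output stable between runs; ECS does not require any particular order.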

Darragh

06/04/2020, 7:10 PM

Yeah it seems like the way to do it, but I couldn't get it to work through python so I'm switching to shell and prefect CLI like you guys 🤘

Darragh

06/04/2020, 8:23 PM

@Joe Schmid @Dylan Hi I’m back 😄 I think there’s a different problem going on that I’m missing. I’ve tried both ways to create the agent: • Python way (

agent = FargateAgent(……), agent.start()

) • Shell script:

export VAR=X, prefect agent start fargate

And neither are able to run my flow. In my CloudWatch logs, for each run I try, I get a huge bunch of python stacktrace, and the ever present

urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=4200)

Is there something stupidly obvious that I’m missing here? It doesn’t make sense that both methods work for other people and not for me. Sample Python and Shell script attached. Am I just being dumb? Cos I’m very confused now. If there’s any other data for diagnosis you need please let me know!

fargate_agent.py fargate_agent.sh

Joe Schmid

06/04/2020, 8:28 PM

Hi @Darragh, let me see if I can recap:
1. You're able to start the Fargate Agent successfully and see evidence that it reaches the Prefect Server backend running in your environment
2. The Fargate Agent notices that a Flow run is ready and triggers a Flow run
3. An ECS Task Definition for the Flow is created and is run successfully
4. When the Flow run tries to execute in the newly created ECS task, it fails with

urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=4200)

If I have that correct, the ECS Task Definition doesn't have

PREFECT__CLOUD__API

set as an environment variable. To check, use the AWS web admin console and examine the task definition for the flow.
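The same check can be scripted instead of clicked through in the console. A minimal sketch that inspects a task definition dict in the shape returned by ECS (`containerDefinitions[].environment[]`) and flags an API URL that points at a loopback host. The function names and the sample data are illustrative assumptions; fetching the real definition (for example with boto3's `describe_task_definition`) is only indicated in a comment.

```python
from urllib.parse import urlparse


def find_env_value(task_definition, name):
    """Return the value of `name` from the first container that sets it, else None."""
    for container in task_definition.get("containerDefinitions", []):
        for entry in container.get("environment", []):
            if entry.get("name") == name:
                return entry.get("value")
    return None


def points_at_localhost(url):
    """True when the URL's host is a loopback name, which a remote ECS task cannot use."""
    return url is not None and urlparse(url).hostname in {"localhost", "127.0.0.1"}


# Sample data only. A real definition could be fetched with something like:
#   boto3.client("ecs").describe_task_definition(taskDefinition="<family>")["taskDefinition"]
sample = {
    "containerDefinitions": [
        {
            "name": "flow",
            "environment": [
                {"name": "PREFECT__CLOUD__API", "value": "http://localhost:4200"}
            ],
        }
    ]
}
api = find_env_value(sample, "PREFECT__CLOUD__API")
print(api, points_at_localhost(api))
```

A `None` result means the variable is missing from the definition; a loopback result means it is set but unusable from inside the Fargate task.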

Darragh

06/04/2020, 8:30 PM

Hey Joe, will do - offline for a short bit, I'll update you when I get back to it

👍 1

Darragh

06/04/2020, 9:04 PM

@Joe Schmid Checked it there - the variable is set in the TaskDefinition, but it’s set to http://localhost:4200

Darragh

06/04/2020, 9:08 PM

If the values are set in the TaskDefinition then this would suggest it’s done in the register_task_definition as opposed to run_task

Darragh

06/04/2020, 9:09 PM

And that a new run is not updating the definition

Darragh

06/04/2020, 9:14 PM

VICTORY AT LAST!!

Darragh

06/04/2020, 9:15 PM

Deleted the task definition and re-ran - this triggered a new TaskDefinition to be created, which was created with the correct value in the variable. Flow succeeded!

🚀 2

🎉 2

Darragh

06/04/2020, 9:17 PM

Interesting issue though - Restarting the agent with a new set of environment variables does not update variable values in any existing TaskDefinitions. Is that intentional, or is there a flag to recreate definitions each time?
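One way to spot this situation before a flow run fails is to compare the environment the agent is now configured with against what an already-registered definition contains. A rough sketch under stated assumptions: `stale_env_entries` is a hypothetical helper, the registered list uses the ECS `name`/`value` shape, and the values are sample data echoing the ones in this thread.

```python
def stale_env_entries(registered_env, desired_env):
    """Return {name: (registered, desired)} for every desired value that differs."""
    registered = {entry["name"]: entry["value"] for entry in registered_env}
    return {
        name: (registered.get(name), value)
        for name, value in desired_env.items()
        if registered.get(name) != value
    }


# What the existing task definition holds (sample data).
registered_env = [
    {"name": "PREFECT__CLOUD__API", "value": "http://localhost:4200"},
    {"name": "PREFECT__LOGGING__LEVEL", "value": "DEBUG"},
]
# What the restarted agent is configured to pass (sample data).
desired_env = {
    "PREFECT__CLOUD__API": "http://1.1.1.1:4200/graphql",
    "PREFECT__LOGGING__LEVEL": "DEBUG",
}
stale = stale_env_entries(registered_env, desired_env)
print(stale)
```

A non-empty result suggests the existing definition needs to be replaced or revised before the new settings reach the flow run.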

Darragh

06/04/2020, 9:19 PM

Thanks so much for all the assistance guys, if you happen to have a “Buy me a Beer” button then pass it over 😄

🍻 2

Dylan

06/04/2020, 9:19 PM

Glad you got it working!

Joe Schmid

06/04/2020, 9:20 PM

Restarting the agent with a new set of environment variables does not update variable values in any existing TaskDefinitions

Check your Fargate Agent setting for

enable_task_revisions

That should either create a new revision for a task definition or an entirely new task definition. (Docs here: https://docs.prefect.io/orchestration/agents/fargate.html )

Darragh

06/04/2020, 9:30 PM

Yeah that'll probably do it 😂😂

Dylan

06/04/2020, 9:45 PM

@Marvin archive “Setting Prefect Server URL with Fargate Agent”

shaking angry fist 2

Dylan

06/04/2020, 9:46 PM

@Chris White can you archive this when you get a moment?

Chris White

06/04/2020, 10:03 PM

@Marvin archive “Setting Prefect Server URL with Fargate Agent”

Marvin

06/04/2020, 10:03 PM

https://github.com/PrefectHQ/prefect/issues/2722

Chris White

06/04/2020, 10:03 PM

sorry @Dylan you will have access shortly!

👍 2
