# prefect-community
d
Back again with more noob questions! I’ve [hopefully] successfully configured my Fargate Agent, to the point where the flow gets picked up and I get the following log output
[2020-06-04 11:01:26,972] INFO - agent | Deploying flow run 05ec3609-4457-4085-8cf3-d01a6d13ccb1
I can see that it’s created the TaskDefinition, but after that it just sits there and never runs the Task, so I’ve no idea what’s going on, and the UI and logs never show any update. Am I missing something? Some magic words to make the agent actually run the task as opposed to just registering it? Being able to turn on debug logging would also be a massive bonus 🙂 All I’m getting from the agent is INFO, and I’ve added the
[logging] level="DEBUG"
to the config.toml, and added
export PREFECT__LOGGING__LEVEL="DEBUG"
but nothing’s happening. UPDATE: In a fit of desperation I went on to my EC2 box and edited the fargate agent code to make all DEBUG statements INFO, and lo and behold, more log output. BUT!!! As far as the
deploy_flow
function is concerned everything succeeded. And I can see the task definition has been created, but nothing else ever happens in the logs or the UI. So I’m slightly wiser but not by a lot. 🙂 Even digging into
deploy_and_update_flow_run
in the root agent.py file isn’t giving me much, it seems like everything is succeeding, so I’m at a loss as to why nothing is coming in the UI to show failed, passed, stalled… UPDATE 2 Ok, found something - I took the definition that prefect created and ran it manually, and it gets a
CannotStartContainerError
failed to initialise logging. AccessDenied to create CloudWatch log group. Fixing. I’ll be back 😂 Does the fargate task need access to call back to the agent/graphql for updates? Finally found some logs and it seems like the flow could be trying to make a callback to localhost:4200, is that right? I realised this was a stupid question as I wrote it. UPDATE 3 Turns out it does. From the Fargate Agent docs: Core server
In order to use this agent with Prefect Core's server the server's GraphQL API endpoint must be accessible.
But nothing to say how to configure it? I’ve overridden the cloud.api endpoint in the config.toml and I can see the update coming out in the logs, but the task is still trying to call back to localhost, presumably because the api endpoint override needs to be injected into the task? As in, into the docker container running the flow?
👀 1
d
@Darragh Apologies, Chris tells me I may have unintentionally misled you
Is your agent running on the same machine as your Prefect Server?
d
The agent is, yes, and has access to the .prefect/config.toml file - but the execution of the flow happens on Fargate rather than on the machine the agent is running on.
So presumably the docker container DOES need to be aware of the “remote” API?
d
Yes, but you’re going to need to set it a little differently
One moment
d
Can I inject it into the Docker storage using env_vars?
d
The fargate agent actually passes the correct URI to your flows when it creates the fargate tasks. Since your agent is being started locally, it can talk to Prefect Server but then your flows think they can reach the server at
localhost
. You need to start the agent with the API configured so that it can pass the URI to the fargate tasks
d
So when you say configured correctly, I start the fargate agent [python fargate_agent.py] in a context where the following variable exists:
PREFECT__SERVER__UI__GRAPHQL__URL=http://1.1.1.1:4200/graphql
Is that right?
d
prefect agent start fargate
with the api set should work
d
Api set in what way?
PREFECT__CLOUD__API?
d
Yes, that’s the Env var 👍
d
Great, that’s a very quick test…
d
I was looking to see if
prefect agent start
has an
--api
flag
PREFECT__CLOUD__API
should do the trick
d
And is that the full path to graphql or the host and port?
d
The full path
https://
etc
d
Hmm. Still same problem. Can’t connect to localhost
What would the corresponding entry in the config.toml be?
[cloud]
api = "http://1.1.1.1:4200/graphql"
???
d
Yes
d
Thanks, going to test that now. Out of curiosity, which takes precedence, env var or config.toml?
Presumably the env?
d
Yes, the ENV takes precedence
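For anyone following along, here is a rough sketch of that precedence rule. It is not Prefect's actual config loader. It only assumes the naming convention visible in this thread, where `PREFECT__LOGGING__LEVEL` lines up with `level` under `[logging]` in config.toml. `apply_env_overrides` and the sample values are hypothetical.

```python
from typing import Dict, Mapping

PREFIX = "PREFECT__"


def apply_env_overrides(
    config: Dict[str, dict], environ: Mapping[str, str]
) -> Dict[str, dict]:
    """Return a copy of `config` with PREFECT__SECTION__KEY variables layered on top."""
    # Copy each section so the caller's dict is left untouched.
    merged = {section: dict(values) for section, values in config.items()}
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue  # unrelated variable, e.g. HOME or PATH
        parts = name[len(PREFIX):].lower().split("__")
        if len(parts) != 2:
            continue  # this sketch only handles simple [section] key pairs
        section, key = parts
        merged.setdefault(section, {})[key] = value
    return merged


# Stand-in for values read from config.toml (illustrative only).
toml_values = {"cloud": {"api": "http://localhost:4200"}, "logging": {"level": "INFO"}}
# Stand-in for the process environment.
env = {"PREFECT__CLOUD__API": "http://1.1.1.1:4200/graphql", "HOME": "/root"}

effective = apply_env_overrides(toml_values, env)
```

In this sketch the environment value replaces the config.toml value for `cloud.api`, while `logging.level` keeps its file value because no matching variable is set.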
d
Ok, so one of my other questions from way up 😄 if I set
PREFECT__LOGGING__LEVEL=DEBUG
then that should get picked up too?
d
Yes, definitely 👍
d
Right, I have a bigger problem so. Neither of those env vars actually get picked up and used
I have the LOGGING one set in the env that the agent runs in, but it never logs debug statements
d
🧐
d
And the
PREFECT__CLOUD__API
is [I think] picked up by the agent, but NOT the flow it runs on fargate
But something strikes me - I’m not calling the
prefect agent start fargate
command directly from CLI, I have it wrapped in a python script so I can pass the necessary Fargate params to it. Would that cause an issue with passing env vars through?
d
Maybe
In your script, could you try setting the environment variables as part of the call to the CLI?
PREFECT__LOGGING__LEVEL=DEBUG prefect agent start fargate
etc?
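If the CLI is being launched from a Python script, one way to do the equivalent of the shell line above is to hand the child process an explicit environment. This is only a sketch: `build_agent_env` and `start_agent` are hypothetical helper names, and the only command it assumes is the `prefect agent start fargate` one already mentioned in this thread.

```python
import os
import subprocess
from typing import Dict, List, Mapping, Optional

# The CLI invocation discussed above.
AGENT_COMMAND: List[str] = ["prefect", "agent", "start", "fargate"]


def build_agent_env(
    overrides: Mapping[str, str], base: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Copy the base environment (default: os.environ) and apply the overrides."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def start_agent(overrides: Mapping[str, str]) -> subprocess.CompletedProcess:
    """Run the agent CLI with the overrides visible to the child process."""
    return subprocess.run(AGENT_COMMAND, env=build_agent_env(overrides), check=False)
```

Calling `start_agent({"PREFECT__LOGGING__LEVEL": "DEBUG"})` would then behave like prefixing the variable on the command line, since the child process sees it in its own environment.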
d
Think that might be the culprit, looking at the docs: `You should continue to use the 
env_vars
 kwarg to pass agent level environment variables to your tasks.`
Sorry, when I say I’m using a python wrapper I mean a full on python script, not just a shell file calling the CLI. It creates a FargateAgent, configures it, and then calls agent.start(), so reading the docs I think I need to pass the env_vars down
Buggered if I can see exactly how though 😄
d
I think that’s correct, I believe there’s an argument for the env_vars
That’s correct 👍
d
Bewdiful. Now I need to figure out can I do it nice and simply by going env_Vars=env_Vars or do I have to list the damn things manually 😄
😂 1
Any idea what the syntax for passing the env_vars is ? Getting errors or silence when I’m passing them:
env_vars={PREFECT__LOGGING_LEVEL:DEBUG}
@Dylan I’m passing through
env_vars={'PREFECT___LOGGING___LEVEL':'DEBUG'}
into the FargateAgent() constructor but it doesn’t seem to be picking it up.
d
@Darragh that sounds like a bug. Can you open an issue with as much info as you have? We’ll triage and address as soon as we can
But maybe also try
env_vars={'PREFECT__LOGGING__LEVEL':'debug'}
just on a hunch?
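On the question of not listing the variables by hand: a small sketch that gathers every `PREFECT__`-prefixed variable from the current environment into a plain dict of strings. `collect_prefect_env` is a hypothetical helper, and the commented usage assumes the `FargateAgent(env_vars=...)` keyword and `agent.start()` call described in this thread; the import path is an assumption and may differ between versions.

```python
import os
from typing import Dict, Mapping, Optional


def collect_prefect_env(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return only the variables whose names start with PREFECT__."""
    source = os.environ if environ is None else environ
    return {name: value for name, value in source.items() if name.startswith("PREFECT__")}


# Hypothetical usage (not run here; import path assumed):
#   from prefect.agent.fargate import FargateAgent
#   agent = FargateAgent(env_vars=collect_prefect_env())
#   agent.start()
```

Note that both keys and values are quoted strings with double underscores between the parts, as in the `env_vars={'PREFECT__LOGGING__LEVEL':'debug'}` suggestion above.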
d
Yeah no problem. I don't mind so much about the logging, but the same issue happens when I'm trying to pass the Cloud_Api through as well, so I can't actually get a fargate task to run..
d
I understand, that’s very frustrating
I’m not personally very familiar with our fargate tools (we’re on GCP)
But I will say I think you’re the first user I’ve encountered who tried to run a fargate setup with Prefect Server (as opposed to Prefect Cloud) so it seems like there are a couple of rough edges we could make smoother
If you outline as many of your pain points as possible in the issue we’ll do our best to address them all
Thanks for helping us make Prefect better!
d
Will do thanks. in the meantime I think I'll have to embed all the options in a shell script and do it that way, hopefully that'll do as a workaround
👍 1
j
@Darragh For config like you're talking about, we set that in environment variables that get created for the ECS task that the Fargate Agent creates. A portion of our config for our Fargate Agent looks like:
containerDefinitions=[
        {
            "command": [],
            "environment": [
                {"name": "PREFECT__LOGGING__LOG_TO_CLOUD", "value": "true"},
                {"name": "AWS_DEFAULT_REGION", "value": REGION_NAME},
I think having the Fargate Agent make
PREFECT__CLOUD__API
available as an environment variable in the ECS task run for each Flow run seems very appropriate in your case.
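The `environment` entries in the fragment above are a list of name/value pairs. A minimal sketch of building that list from an ordinary dict, assuming nothing beyond the shape shown in the fragment; `to_ecs_environment` is a hypothetical helper and the sample values are illustrative.

```python
from typing import Dict, List


def to_ecs_environment(variables: Dict[str, str]) -> List[Dict[str, str]]:
    """Convert {"NAME": "value"} into [{"name": "NAME", "value": "value"}, ...]."""
    # Sorted so the output order is stable between runs.
    return [{"name": name, "value": value} for name, value in sorted(variables.items())]


environment = to_ecs_environment(
    {
        "PREFECT__LOGGING__LOG_TO_CLOUD": "true",
        "PREFECT__CLOUD__API": "http://1.1.1.1:4200/graphql",
    }
)
```

The resulting list can be dropped into the `"environment"` key of a container definition like the one quoted above.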
d
Yeah it seems like the way to do it, but I couldn't get it to work through python so I'm switching to shell and prefect CLI like you guys 🤘
@Joe Schmid @Dylan Hi I’m back 😄 I think there’s a different problem going on that I’m missing. I’ve tried both ways to create the agent: • Python way (
agent = FargateAgent(……), agent.start()
) • Shell script:
export VAR=X, prefect agent start fargate
And neither are able to run my flow. In my CloudWatch logs, for each run I try, I get a huge bunch of python stacktrace, and the ever present
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=4200)
Is there something stupidly obvious that I’m missing here? It doesn’t make sense that both methods work for other people and not for me. Sample Python and Shell script attached. Am I just being dumb? Cos I’m very confused now. If there’s any other data for diagnosis you need please let me know!
j
Hi @Darragh, let me see if I can recap:
1. You're able to start the Fargate Agent successfully and see evidence that it reaches the Prefect Server backend running in your environment
2. The Fargate Agent notices that a Flow run is ready and triggers a Flow run
3. An ECS Task Definition for the Flow is created and is run successfully
4. When the Flow run tries to execute in the newly created ECS task, it fails with
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=4200)
If I have that correct, the ECS Task Definition doesn't have
PREFECT__CLOUD__API
set as an environment variable. To check, use the AWS web admin console and examine the task definition for the flow.
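The same check can be scripted instead of clicking through the console. The sketch below looks up one variable in a task definition dict shaped like the `taskDefinition` part of an ECS `describe_task_definition` response. `find_env_value` is a hypothetical helper; the boto3 call is shown only as a comment, and the family name placeholder is left for you to fill in.

```python
from typing import Any, Dict, Optional


def find_env_value(task_definition: Dict[str, Any], name: str) -> Optional[str]:
    """Return the value of `name` from the first container that defines it, else None."""
    for container in task_definition.get("containerDefinitions", []):
        for entry in container.get("environment", []):
            if entry.get("name") == name:
                return entry.get("value")
    return None


# Hypothetical usage with boto3 (needs AWS credentials, so not run here):
#   import boto3
#   response = boto3.client("ecs").describe_task_definition(taskDefinition="<family-or-arn>")
#   print(find_env_value(response["taskDefinition"], "PREFECT__CLOUD__API"))
```

If this returns `None`, or a localhost URL, the flow container has no way to reach a Prefect Server running elsewhere.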
d
Hey Joe, will do - offline for a short bit, I'll update you when I get back to it
👍 1
@Joe Schmid Checked it there - the variable is set in the TaskDefinition, but it’s set to http://localhost:4200
If the values are set in the TaskDefinition then this would suggest it’s done in the register_task_definition as opposed to run_task
And that a new run is not updating the definition
VICTORY AT LAST!!
Deleted the task definition and re-ran - this triggered a new TaskDefinition to be created, which was created with the correct value in the variable. Flow succeeded!
🚀 2
🎉 2
Interesting issue though - Restarting the agent with a new set of environment variables does not update variable values in any existing TaskDefinitions. Is that intentional, or is there a flag to recreate definitions each time?
Thanks so much for all the assistance guys, if you happen to have a “Buy me a Beer” button then pass it over 😄
🍻 2
d
Glad you got it working!
j
Restarting the agent with a new set of environment variables does not update variable values in any existing TaskDefinitions
Check your Fargate Agent setting for
enable_task_revisions
That should either create a new revision for a task definition or an entire new task definition. (Docs here: https://docs.prefect.io/orchestration/agents/fargate.html )
d
Yeah that'll probably do it 😂😂
d
@Marvin archive “Setting Prefect Server URL with Fargate Agent”
shaking angry fist 2
@Chris White can you archive this when you get a moment?
c
@Marvin archive “Setting Prefect Server URL with Fargate Agent”
c
sorry @Dylan you will have access shortly!
👍 2