Darragh
06/04/2020, 2:31 PM
[2020-06-04 11:01:26,972] INFO - agent | Deploying flow run 05ec3609-4457-4085-8cf3-d01a6d13ccb1
I can see that it’s created the TaskDefinition, but after that it just sits there and never runs the Task, so I’ve no idea what’s going on, and the UI and logs never show any update. Am I missing something? Some magic words to make the agent actually run the task as opposed to just registering it?
Being able to turn on debug logging would also be a massive bonus 🙂 All I’m getting from the agent is INFO, and I’ve added [logging] level="DEBUG" to the config.toml, and added export PREFECT__LOGGING__LEVEL="DEBUG", but nothing’s happening.
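For reference, the two settings mentioned here are meant to be equivalent: the environment variable name spells out the config section and key, separated by double underscores. The snippet below is only an illustrative sketch of that naming convention, not Prefect's own parser; the helper name `env_to_config_path` is hypothetical, and the exact handling of malformed names inside Prefect is an assumption.

```python
def env_to_config_path(name, prefix="PREFECT__"):
    """Map an env var name to a dotted config path (illustration only).

    Returns None when the name does not start with the expected prefix.
    This mimics the double-underscore convention; it is not Prefect's code.
    """
    if not name.startswith(prefix):
        return None
    parts = name[len(prefix):].split("__")
    return ".".join(part.lower() for part in parts)


# The correctly spelled variable maps onto the [logging] level setting.
print(env_to_config_path("PREFECT__LOGGING__LEVEL"))  # logging.level

# A single underscore between section and key no longer splits into two parts.
print(env_to_config_path("PREFECT__LOGGING_LEVEL"))   # logging_level
```

Under this convention a variable with the wrong number of underscores would simply name a different key, so it is worth double-checking the spelling before suspecting anything else.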
UPDATE
In a fit of desperation I went on to my EC2 box and edited the fargate agent code to make all DEBUG statements INFO, and lo and behold, more log output.
BUT!!! As far as the deploy_flow function is concerned everything succeeded. And I can see the task definition has been created, but nothing else ever happens in the logs or the UI. So I’m slightly wiser but not by a lot. 🙂 Even digging into deploy_and_update_flow_run in the root agent.py file isn’t giving me much; it seems like everything is succeeding, so I’m at a loss as to why nothing is coming in the UI to show failed, passed, stalled…
UPDATE 2
Ok, found something - I took the definition that prefect created and ran it manually, and it gets a CannotStartContainerError: failed to initialise logging. AccessDenied to create CloudWatch log group. Fixing.
I’ll be back 😂
Does the fargate task need access to call back to the agent/graphql for updates? Finally found some logs and it seems like the flow could be trying to make a callback to localhost:4200, is that right? I realised this was a stupid question as I wrote it.
UPDATE 3
Turns out it does. From the Fargate Agent docs, under "Core server":
"In order to use this agent with Prefect Core's server the server's GraphQL API endpoint must be accessible."
But nothing to say how to configure it? I’ve overridden the cloud.api endpoint in the config.toml and I can see the update coming out in the logs, but the task is still trying to call back to localhost, presumably because the api endpoint override needs to be injected into the task? As in, into the docker container running the flow?
Dylan
06/04/2020, 3:35 PM
Darragh
06/04/2020, 3:36 PM
Dylan
06/04/2020, 3:36 PM
Darragh
06/04/2020, 3:37 PM
Dylan
06/04/2020, 3:43 PM
localhost. You need to start the agent with the API configured so that it can pass the URI to the Fargate tasks
Darragh
06/04/2020, 3:45 PM
PREFECT__SERVER__UI__GRAPHQL__URL=http://1.1.1.1:4200/graphql
Dylan
06/04/2020, 3:45 PM
prefect agent start fargate with the api set should work
Darragh
06/04/2020, 3:45 PM
Dylan
06/04/2020, 3:46 PM
Darragh
06/04/2020, 3:46 PM
Dylan
06/04/2020, 3:46 PM
prefect agent start has an --api flag
PREFECT__CLOUD__API should do the trick
Darragh
06/04/2020, 3:54 PM
Dylan
06/04/2020, 3:54 PM
https:// etc
Darragh
06/04/2020, 3:55 PM
[cloud]
api = "http://1.1.1.1:4200/graphql"
???
Dylan
06/04/2020, 4:07 PM
Darragh
06/04/2020, 4:08 PM
Dylan
06/04/2020, 4:09 PM
Darragh
06/04/2020, 4:10 PM
PREFECT__LOGGING__LEVEL=DEBUG then that should get picked up too?
Dylan
06/04/2020, 4:11 PM
Darragh
06/04/2020, 4:16 PM
Dylan
06/04/2020, 4:17 PM
Darragh
06/04/2020, 4:17 PM
PREFECT__CLOUD__API is [I think] picked up by the agent, but NOT the flow it runs on Fargate
prefect agent start fargate command directly from CLI, I have it wrapped in a python script so I can pass the necessary Fargate params to it. Would that cause an issue with passing env vars through?
Dylan
06/04/2020, 4:19 PM
PREFECT__LOGGING__LEVEL=DEBUG prefect agent start fargate etc?
Darragh
06/04/2020, 4:20 PM
env_vars kwarg to pass agent level environment variables to your tasks.
Dylan
06/04/2020, 4:23 PM
Darragh
06/04/2020, 4:25 PM
env_vars={PREFECT__LOGGING_LEVEL:DEBUG}
env_vars={'PREFECT___LOGGING___LEVEL':'DEBUG'}
into the FargateAgent() constructor but it doesn’t seem to be picking it up.
Dylan
06/04/2020, 5:59 PM
env_vars={'PREFECT__LOGGING__LEVEL':'debug'} just on a hunch?
Darragh
06/04/2020, 6:02 PM
Dylan
06/04/2020, 6:02 PM
Darragh
06/04/2020, 6:07 PM
Joe Schmid
06/04/2020, 6:58 PM
containerDefinitions=[
    {
        "command": [],
        "environment": [
            {"name": "PREFECT__LOGGING__LOG_TO_CLOUD", "value": "true"},
            {"name": "AWS_DEFAULT_REGION", "value": REGION_NAME},
I think having the Fargate Agent make PREFECT__CLOUD__API available as an environment variable in the ECS task run for each Flow run seems very appropriate in your case.
Darragh
06/04/2020, 7:10 PM
• Python script: agent = FargateAgent(……), agent.start()
• Shell script: export VAR=X, prefect agent start fargate
And neither are able to run my flow. In my CloudWatch logs, for each run I try, I get a huge bunch of python stacktrace, and the ever present urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=4200)
Is there something stupidly obvious that I’m missing here? It doesn’t make sense that both methods work for other people and not for me.
Sample Python and Shell script attached. Am I just being dumb? Cos I’m very confused now.
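For readers without the attachments, the Python variant described above can be sketched roughly as follows. This is not the attached script: the helper names `build_agent_env` and `start_fargate_agent` are hypothetical, the URL is the placeholder used earlier in the thread, and the Fargate-specific parameters elided as "……" above are left out rather than guessed. It assumes the `env_vars` kwarg quoted from the docs is what forwards settings to the flow-run task.

```python
def build_agent_env(api_url, log_level="DEBUG"):
    """Return the env vars the agent should pass on to each flow-run task."""
    return {
        "PREFECT__CLOUD__API": api_url,
        "PREFECT__LOGGING__LEVEL": log_level,
    }


def start_fargate_agent(api_url):
    """Start a Fargate agent that forwards the API endpoint to its tasks."""
    # Imported lazily so build_agent_env can be used without Prefect installed.
    from prefect.agent.fargate import FargateAgent

    # Fargate launch and network parameters are omitted here; they are
    # deployment-specific and not given in the thread.
    agent = FargateAgent(env_vars=build_agent_env(api_url))
    agent.start()


if __name__ == "__main__":
    # Placeholder address from the thread; replace with the server's real one.
    print(build_agent_env("http://1.1.1.1:4200/graphql"))
```

Keeping the variable names in one helper makes it easier to compare what the agent is asked to forward against what actually ends up in the ECS task definition.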
If there’s any other data for diagnosis you need please let me know!
Joe Schmid
06/04/2020, 8:28 PM
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=4200)
If I have that correct, the ECS Task Definition doesn't have PREFECT__CLOUD__API set as an environment variable. To check, use the AWS web admin console and examine the task definition for the flow.
Darragh
06/04/2020, 8:30 PM
Dylan
06/04/2020, 9:19 PM
Joe Schmid
06/04/2020, 9:20 PM
Restarting the agent with a new set of environment variables does not update variable values in any existing TaskDefinitions
Check your Fargate Agent setting for enable_task_revisions. That should either create a new revision for a task definition or an entire new task definition. (Docs here: https://docs.prefect.io/orchestration/agents/fargate.html)
Darragh
06/04/2020, 9:30 PM
Dylan
06/04/2020, 9:45 PM
Chris White
06/04/2020, 10:03 PM
Marvin
06/04/2020, 10:03 PM
Chris White
06/04/2020, 10:03 PM
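Joe's suggestion above, to examine the task definition and see whether PREFECT__CLOUD__API is actually set, can also be done from a script instead of the AWS console. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the task definition family name has to be supplied by the reader, and `fetch_task_definition` needs boto3 plus AWS credentials, while the checking logic itself runs on a plain dict shaped like the "taskDefinition" part of an ECS DescribeTaskDefinition response.

```python
def container_env(task_definition):
    """Collect environment variables per container of an ECS task definition."""
    result = {}
    for container in task_definition.get("containerDefinitions", []):
        env = {e["name"]: e["value"] for e in container.get("environment", [])}
        result[container.get("name", "<unnamed>")] = env
    return result


def containers_missing(task_definition, var="PREFECT__CLOUD__API"):
    """Return the names of containers that do not define `var`."""
    return [
        name
        for name, env in container_env(task_definition).items()
        if var not in env
    ]


def fetch_task_definition(family):
    """Fetch the latest active revision of a task definition (needs boto3)."""
    import boto3  # imported lazily; only required for the live lookup

    ecs = boto3.client("ecs")
    return ecs.describe_task_definition(taskDefinition=family)["taskDefinition"]


if __name__ == "__main__":
    # Hand-written sample resembling the fragment quoted earlier in the thread.
    sample = {
        "containerDefinitions": [
            {
                "name": "flow",
                "environment": [
                    {"name": "PREFECT__LOGGING__LOG_TO_CLOUD", "value": "true"},
                ],
            }
        ]
    }
    print(containers_missing(sample))  # ['flow']
```

If the variable is missing from an existing definition, that fits Joe's point that restarting the agent does not rewrite existing TaskDefinitions, which is where enable_task_revisions comes in.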