Robert Banick
04/05/2024, 4:35 PM
job_variables overrides are passed through to ECS containers in a push work pool? I have a push work pool set up to receive custom cloudwatch_logs_options from each deployment linked to that work pool (stream_output and configure_cloudwatch_logs default to True). I’m able to pass through options like below, but flow runs’ logs don’t actually output to the specified log group.
"cloudwatch_logs_options": {
    "awslogs-region": "us-east-1",
    "awslogs-logs-group": "test-logs-group",
    "awslogs-create-group": "false", # already exists
    "awslogs-stream-prefix": "my_deployment_name"
}
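One thing worth ruling out with an options block like the one above is a misspelled key, since awslogs driver options are matched by exact name (the working task definition later in this thread uses awslogs-group). A minimal sketch of such a check follows. KNOWN_AWSLOGS_OPTIONS is a non-exhaustive set assumed for illustration, and unknown_log_options is a hypothetical helper, not part of prefect or prefect_aws.

```python
# Non-exhaustive set of awslogs log driver option names (an assumption for
# illustration; the ECS documentation holds the authoritative list).
KNOWN_AWSLOGS_OPTIONS = {
    "awslogs-region",
    "awslogs-group",
    "awslogs-create-group",
    "awslogs-stream-prefix",
    "awslogs-datetime-format",
    "awslogs-multiline-pattern",
}


def unknown_log_options(options: dict) -> list:
    """Return, sorted, the option keys that are not in KNOWN_AWSLOGS_OPTIONS."""
    return sorted(key for key in options if key not in KNOWN_AWSLOGS_OPTIONS)


# The options exactly as posted above.
cloudwatch_logs_options = {
    "awslogs-region": "us-east-1",
    "awslogs-logs-group": "test-logs-group",
    "awslogs-create-group": "false",
    "awslogs-stream-prefix": "my_deployment_name",
}

suspect_keys = unknown_log_options(cloudwatch_logs_options)
print(suspect_keys)
```

A flagged key is only a hint that the name deserves a second look; it does not by itself explain why no log configuration reached the container.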
A script I have to pull container definitions at runtime further reveals that no LogsOptions were configured at the container level, unlike previously under our Agent-based system.
I’ve checked the related IAM role and it has CloudWatchLogsFullAccess enabled, so it’s not a permissions issue. The work pool is in any case using credentials that worked under a previous Agent-based system. Unfortunately, the lack of push work pool logs makes it impossible to troubleshoot further at the work pool level.
I’ve dug around quite a lot in the latest prefect and prefect_aws code since the docs are sparse on these topics but can’t seem to find where / how work pools pass through job variables and ECS ones in particular.
Help would be really appreciated on this topic! I’ve gone well into the weeds but feel like the returns to self-directed inquiry are shrinking.
Jake Kaplan
04/05/2024, 4:49 PM
configure_cloudwatch_logs does need to be explicitly set to True (it does not default to on)
Robert Banick
04/05/2024, 4:59 PM
configure_cloudwatch_logs and stream_output to default to True in the job template
Robert Banick
04/05/2024, 5:01 PM
c93d1948-f85b-4747-8c40-1b61f318e82e
Jake Kaplan
04/05/2024, 6:06 PM
Robert Banick
04/05/2024, 6:55 PM
Jake Kaplan
04/05/2024, 8:59 PM
Jake Kaplan
04/05/2024, 9:01 PM
ContainerDefinitions from your Task Definition revision? e.g.
Jake Kaplan
04/05/2024, 9:02 PM
Robert Banick
04/15/2024, 4:22 PM
Robert Banick
04/15/2024, 4:22 PM
Robert Banick
04/15/2024, 4:28 PM
ContainerDefinitions JSON text
"containerDefinitions": [
    {
        "name": "prefect",
        "image": "421396523132.dkr.ecr.us-east-1.amazonaws.com/prefect:gridded-etl-dev-latest",
        "cpu": 0,
        "links": [],
        "portMappings": [],
        "essential": true,
        "entryPoint": [],
        "command": [],
        "environment": [],
        "environmentFiles": [],
        "mountPoints": [],
        "volumesFrom": [],
        "dnsServers": [],
        "dnsSearchDomains": [],
        "extraHosts": [],
        "dockerSecurityOptions": [],
        "dockerLabels": {},
        "ulimits": [],
        "systemControls": [],
        "credentialSpecs": []
    }
],
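The container definition above has no logConfiguration key at all, which can be confirmed programmatically. A minimal sketch, assuming the input is a task definition dict shaped like the taskDefinition entry returned by the ECS DescribeTaskDefinition API; containers_missing_log_config is a hypothetical helper, and the sample is abbreviated from the JSON above.

```python
import json


def containers_missing_log_config(task_definition: dict) -> list:
    """Return names of containers that have no (or an empty) logConfiguration."""
    return [
        container.get("name", "<unnamed>")
        for container in task_definition.get("containerDefinitions", [])
        if not container.get("logConfiguration")
    ]


# Abbreviated copy of the container definition posted above.
sample = json.loads(
    """
    {
        "containerDefinitions": [
            {"name": "prefect", "cpu": 0, "essential": true, "environment": []}
        ]
    }
    """
)

missing = containers_missing_log_config(sample)
print(missing)
```

Running the same check against each new task definition revision would show whether the logging options ever reach the container level.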
Robert Banick
04/15/2024, 4:44 PM
PREFECT_LOGGING_EXTRA_LOGGERS variable as a stream prefix… I was able to get the below working by setting the main task definition as such, meaning no involvement from the work pool or job_variables for the deployment
"logConfiguration": {
    "logDriver": "awslogs",
    "options": {
        "awslogs-group": "gridded-etl-logs",
        "awslogs-region": "us-east-1",
        "awslogs-stream-prefix": "${PREFECT_LOGGING_EXTRA_LOGGERS}"
    }
},
Unfortunately it didn’t actually pull in the variable as the stream prefix. It just printed the literal "${PREFECT_LOGGING_EXTRA_LOGGERS}" as the prefix
Robert Banick
04/15/2024, 5:40 PM
job_variables are set up to provide the PREFECT_LOGGING_EXTRA_LOGGERS variable, as so
{
    "env": {
        "PREFECT_LOGGING_EXTRA_LOGGERS": "cpc_temp_min"
    }
}
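The literal prefix reported above is consistent with the placeholder being passed through verbatim rather than expanded from the container environment. One possible workaround is to substitute the value client-side before the task definition is registered. The following is only a sketch under that assumption: render_log_options is a hypothetical helper, and it presumes the value is already known wherever the task definition is built.

```python
from string import Template


def render_log_options(options: dict, env: dict) -> dict:
    """Replace ${NAME} placeholders in option values with values from env.

    Placeholders with no matching env entry are left untouched.
    """
    return {
        key: Template(value).safe_substitute(env)
        for key, value in options.items()
    }


# Options and env values copied from the snippets above.
log_options = {
    "awslogs-group": "gridded-etl-logs",
    "awslogs-region": "us-east-1",
    "awslogs-stream-prefix": "${PREFECT_LOGGING_EXTRA_LOGGERS}",
}
job_env = {"PREFECT_LOGGING_EXTRA_LOGGERS": "cpc_temp_min"}

rendered = render_log_options(log_options, job_env)
print(rendered["awslogs-stream-prefix"])
```

This makes the prefix static per registered revision, so it only helps if a revision is registered per deployment or per run.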
Robert Banick
04/15/2024, 5:40 PM
Robert Banick
04/15/2024, 5:42 PM
job_variables, but as mentioned in the initial post this isn’t working
Robert Banick
04/15/2024, 6:23 PM
Jake Kaplan
04/15/2024, 6:30 PM
Jake Kaplan
04/15/2024, 6:32 PM
cloudwatch_logs_options, let me see if I am able to reproduce this. If I'm not I may need to enable some special debug logging and ask you to execute a couple of runs
Robert Banick
04/15/2024, 6:32 PM
Robert Banick
04/15/2024, 8:43 PM
job_variables the flow hangs indefinitely until it crashes.
3. If I specify cloudwatch options as part of the Push Work Pool default settings the flow runs, but there are no corresponding Log Configuration Options under the corresponding task, and indeed no logs are output to the specified log group.
Unfortunately without access to logs from the push work pool it’s hard to see further what’s going on.
As a side note, the removal of logs is effectively a regression in capability vs. Agents.
Jake Kaplan
04/16/2024, 2:03 PM
job_variables and see those populate on the registered task_definition.
I think I asked you for this before, but can you confirm:
1. Does the task definition have logging configured? If not, are you able to share the full JSON for the definition?
2. Is it a prefect registered task definition vs your own? (the revision family name would look like e.g. prefect__fe745acc-1128-45f4-a4f4-cd1630740d51__16d65ba3-038f-440c-9216-469a6c653565 if it was generated by prefect as opposed to being passed yourself)
Robert Banick
04/16/2024, 4:19 PM
job_variables such that I can customize the stream prefix, which is important for logs discovery.
2. These are task definitions we register and revise ourselves.
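The family-name check Jake describes can also be done mechanically. A minimal sketch, assuming a Prefect-generated family always has the form prefect__<uuid>__<uuid>; that pattern is inferred only from the single example in his message, not from the prefect_aws source. looks_prefect_registered is a hypothetical helper and "gridded-etl-dev" is a made-up family name used for contrast.

```python
import re

# Lowercase hyphenated UUID, as in the example family name above.
_UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"

# Assumed pattern: prefect__<uuid>__<uuid>
PREFECT_FAMILY = re.compile(rf"^prefect__{_UUID}__{_UUID}$")


def looks_prefect_registered(family: str) -> bool:
    """Return True if the family name matches the assumed Prefect pattern."""
    return PREFECT_FAMILY.match(family) is not None


generated = looks_prefect_registered(
    "prefect__fe745acc-1128-45f4-a4f4-cd1630740d51__16d65ba3-038f-440c-9216-469a6c653565"
)
self_managed = looks_prefect_registered("gridded-etl-dev")  # hypothetical name
print(generated, self_managed)
```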
Just to note, the current configuration works just fine on our current Agent-based setup. Digging into the Prefect source code it seems like the Agent populates the log group, stream-prefix, etc. during the entrypoint script. I don’t strictly need to recreate that exact behavior but perhaps it will shed some light on where the breakdown is?
Jake Kaplan
04/16/2024, 6:39 PM
Robert Banick
04/16/2024, 6:41 PM
Robert Banick
04/16/2024, 6:41 PM
Robert Banick
04/17/2024, 8:04 PM
Jake Kaplan
04/18/2024, 1:27 PM
Robert Banick
04/18/2024, 3:57 PM
Jake Kaplan
04/18/2024, 4:12 PM
Robert Banick
04/18/2024, 4:23 PM
awslogs-stream-prefix per flow run. This vastly improves logs discovery when things go wrong.
In practice this means letting prefect register a definition since it’s linked to a prefect triggered action (a flow run).
Otherwise I can hard code the prefix (and log group) in the TD but then it’s static. I tried referencing the PREFECT_LOGGING_EXTRA_LOGGERS env. variable in the TD to apply a stream-prefix that corresponds to the flow run in some capacity, but I think that’s registered after the TD is applied so it can’t pick it up.
Jake Kaplan
04/18/2024, 4:29 PM
awslogs-stream-prefix should default to the name of the flow run if you don't pass anything extra. I believe that is the same behavior as using agents with ECS infra
Jake Kaplan
04/18/2024, 4:30 PM
Robert Banick
04/18/2024, 4:31 PM
Robert Banick
04/18/2024, 4:31 PM
awslogs-stream-prefix blank
Robert Banick
04/18/2024, 4:32 PM
Robert Banick
04/18/2024, 4:32 PM
Jake Kaplan
04/18/2024, 4:33 PM
Robert Banick
04/18/2024, 4:38 PM
Robert Banick
04/18/2024, 4:38 PM
Jake Kaplan
04/18/2024, 4:41 PM
awslogs-stream-prefix to the name of the flow run (benevolent-pigeon)
Jake Kaplan
04/18/2024, 4:42 PM
Robert Banick
04/18/2024, 4:44 PM
Robert Banick
04/18/2024, 4:45 PM
Robert Banick
04/18/2024, 4:45 PM
Robert Banick
04/18/2024, 4:49 PM
"networkMode": "awsvpc" that currently are handled by the TD. It seems the job template provided by the workpool is not an exact 1-to-1 match with task definitions in terms of where fields are nested, so if you have any examples of a job template providing all these things it would be a big help
Robert Banick
04/18/2024, 7:45 PM
Jake Kaplan
04/18/2024, 9:18 PM
Robert Banick
04/18/2024, 9:20 PM