# ask-community
v
I'm pretty puzzled by a good way to create logging from tasks that run inside Docker containers (that reflects correctly in the Cloud UI). I have a flow spinning up a bunch of different docker containers as part of a deployment, but I'm not sure what a good way is to handle logging for the individual containers through Prefect. Right now everything is running on the same machine: a worker on Docker runs a flow with various tasks in which docker containers are created and run using the `prefect.infrastructure.container` package. The logging from the containers is piped through to the Prefect UI using `log_prints=True`. However, that means there is no proper error handling, as everything has the same severity and errors are not propagated from the containers. It seems there are a few ways to handle this:
• Parse the logs after a task is finished. This could work, but it's not real-time, and any unexpected crashes might prevent this from running as expected.
• Parse the logs while the container is running, by connecting to the containers with the python `docker` package. This might work, but I'm introducing another dependency (the `docker` package), and it would not work when the machine is remote. It does not seem to be possible using only the default `prefect` python package.
• Somehow set up logging inside the container that gets passed the relevant flow/task information from the worker, so that it can recreate a prefect logger and log everything directly to Prefect.
It feels like this should be easier and a ready-made solution should be available, but after searching a lot online and trying various options I still haven't found a satisfying one. Maybe I'm missing something about how Prefect is supposed to be set up and what the philosophy behind it is.
n
hi @Vincent Rubingh - have you tried just using `get_run_logger`? it should just work and has all the standard methods (`info()`, `debug()`, etc). `log_prints` is just sending whatever you send to `print` through `get_run_logger().info()`
v
Hey Nate, thanks for your response and sorry for the late reply. Yes, I can get a logger inside the task like that, but that's not inside my docker container that is started up inside the task. So if I set `log_prints=False`, none of the output from inside the container (`docker_container_block`) ends up in prefect:
```python
from prefect import runtime, task
from prefect.infrastructure.container import DockerContainer
from prefect.variables import Variable

@task(name="Parser", retries=1, log_prints=True)
def run__parser():

    # TODO: failures from within the docker container are not propagating to the (failed) task

    docker_container_block = DockerContainer.load("docker-container-name")
    # forward run metadata into the container as env vars
    docker_container_block.env.update(
        {
            "PREFECT_TASK_ID": runtime.task_run.id,
            "PREFECT_TASK_NAME": runtime.task_run.name,
            "PREFECT_FLOW_ID": runtime.flow_run.id,
            "PREFECT_FLOW_NAME": runtime.flow_run.name,
            "MAX_NUM_ROUNDS": Variable.get("parser_num_rounds").value,
        }
    )
    container_result = docker_container_block.run()

    # log_parser_results is a user-defined helper (not shown)
    log_result = log_parser_results(
        artifact_name="parser-report",
        service_name="Parser",
        service_path="parser/dev/logs/parser.log")

    return {}
```
As far as I understand, I cannot use `get_run_logger` inside my `docker_container_block`? That's what I was trying, so I can get the logging with the right levels to propagate correctly (especially errors). And by "inside" I mean using `get_run_logger` in the actual docker container. That's why I was trying to forward the task_id and flow_id to the docker container, as I figured with those I can recreate a prefect logger in my container(s)
n
as far as when you can use `get_run_logger`, it's just within the context of any task or flow run (pedantic detail: there's a `ContextVar` we set when we enter the flow or task run engine, so if you're in that context, you can use `get_run_logger`), i.e. inside a flow or task function
> So if I set `log_prints=False`, none of the output from inside the container (`docker_container_block`) ends up in prefect:
`log_prints` will patch the builtin `print`, which I don't see used above, so it's not clear to me why anything should be sent to the API as logs, not knowing how `log_parser_results` works at least
v
So I'll just make it as simple as possible, to make my intention clear: I have a bunch of custom docker images for a series of different services. Prefect seems like a great way to orchestrate them and run them consecutively as part of a flow. I set up a task (like the code above) that loads a docker image through a `DockerContainer` object (part of `prefect.infrastructure.container`). So far so good: I can get this to work, and I use different tasks to run different docker containers. However, I want to somehow redirect the output from inside the docker containers (which can contain many lines of info/success/debug level output as well as errors) correctly to the Prefect UI. (Either redirect by changing the code inside the task above, or change the logging inside my docker image to log to prefect directly.) So far I've only managed to redirect the output from the running docker containers by using `log_prints=True`. I haven't found another way to get, for example, an error to correctly propagate from inside my running docker container to Prefect. Is this a use case that's supported?
Or would it be better to for example log my docker container output to a file, and then read/parse from there inside the task?
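as a rough sketch of that file-parsing idea: assuming the container writes lines like `ERROR: ...` (the line format and the helper name here are made up), the task could replay each line at its proper severity instead of printing everything at INFO:

```python
import logging

# hypothetical "LEVEL: message" line format - adjust to your container's actual output
LEVELS = {"DEBUG": logging.DEBUG, "INFO": logging.INFO,
          "WARNING": logging.WARNING, "ERROR": logging.ERROR}

def replay_log_line(line: str, logger: logging.Logger) -> int:
    """Parse a 'LEVEL: message' line and re-emit it at that severity.
    Unrecognized lines are re-emitted at INFO. Returns the level used."""
    prefix, _, message = line.partition(":")
    level = LEVELS.get(prefix.strip().upper(), logging.INFO)
    logger.log(level, message.strip() or line.strip())
    return level
```

inside a task you'd call this with `get_run_logger()` as the logger, so errors from the container's log file show up as ERROR-level logs on the task run.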
n
so I'm sorry I don't have a ton of time to step through this super diligently, but I'll make a couple observations:
• the infra blocks like `DockerContainer(...).run()` were originally added to prefect 2.x as a way of defining deployment config. they happen to be convenient wrappers for invoking work on some infra like a container, but these are removed in 3.x. generally speaking I would recommend creating actual deployments for these pieces of work that run on their own containers, and then chaining them in a parent flow, using `run_deployment` to kick them off. using these infra blocks directly is not a 1st class pattern and is likely why you're having a hard time finding info on it
• it's not clear to me what's happening in each container of yours rn, but generally, if you're running prefect code and you're in a task or flow context, `get_run_logger` should just work as long as that container has `PREFECT_API_KEY` and `PREFECT_API_URL` set as env vars, so that the API it talks to is the one you expect (again, setting these env vars just happens for you if you define deployments for these intermediate containers, but you'd have to inject them yourself if you use `DockerContainer` directly)
v
ah ok that's very helpful, I'll switch things over to prefect 3.x style then, and put these containers as their own deployments like you mentioned.
that's exactly the kind of info I needed. Thank you!
n
sure thing, in case it's a useful reference, here's a silly example that follows the general pattern I'm talking about
lastly, I'd point out for later: if chaining "child" deployments becomes cumbersome, check out event triggers that can be defined on deployments. that way (in many cases) you can get away with not having a parent "chaining" flow at all, because each downstream deployment just has a trigger defined to `expect` the `prefect.flow-run.Completed` event from the upstream