# prefect-community
r
Hey folks, I'm wondering how to set the logging level to DEBUG on Prefect Cloud when using the dask-kubernetes environment on AWS EKS. We successfully added the env variable to the Docker container using
env_vars={"PREFECT__LOGGING__LEVEL": "DEBUG"},
in
flow.storage = Docker(...)
. However, we still don't see the debug level log messages in cloud 🤔 Do we need to change anything else?
✔️ 1
n
Hi @Robin - sorry you're running into this. Could you try setting the logging level on your agent instead? I think this could be overwriting your Docker env vars. You can do so with the
config.toml
in your agent's environment as well:
Copy code
[logging]
level="DEBUG"
r
Hmm, could you further elaborate how to do this on an AWS EKS cluster?
n
You'll need to set either the environment variable you described above or the value I mentioned in
config.toml
in your k8s agent's environment (and then restart your agent)
r
Hi 🙂 Any idea, why the above mentioned method with
env_vars={"PREFECT__LOGGING__LEVEL": "DEBUG"},
does not work on AWS EKS, dask-kubernetes cluster?
I tried to set the environment variable for the Prefect agent via Pulumi, apparently successfully (see image). However, I still don't get any debug-level logging messages...
Just confirming that the environment variable was successfully set on both the Prefect agent and the flow's Docker container (see print statement from within the Python flow). However, the debug logs are still not printed out 😞
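(For context, a rough sketch of how an env var like this can be set on the agent deployment with Pulumi — the actual Pulumi program isn't shown in this thread, so the image, labels, and agent command below are placeholders:)
Copy code
import pulumi_kubernetes as k8s

# Deployment running the Prefect Kubernetes agent with DEBUG logging.
prefect_agent = k8s.apps.v1.Deployment(
    "prefect-agent",
    spec={
        "replicas": 1,
        "selector": {"matchLabels": {"app": "prefect-agent"}},
        "template": {
            "metadata": {"labels": {"app": "prefect-agent"}},
            "spec": {
                "containers": [
                    {
                        "name": "agent",
                        "image": "prefecthq/prefect:latest",  # placeholder image/tag
                        "command": ["/bin/bash", "-c"],
                        "args": ["prefect agent start kubernetes"],  # placeholder agent command
                        "env": [
                            # The env var under discussion in this thread
                            {"name": "PREFECT__LOGGING__LEVEL", "value": "DEBUG"},
                            # Cloud auth and other agent settings omitted
                        ],
                    }
                ]
            },
        },
    },
)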
n
Hi @Robin - it's tough to tell where this might be happening without knowing your setup better; are you able to show a small reproducible example?
r
TL;DR:
1. Set up an EKS cluster
2. Register a flow with the following environment
3. Run the flow
Expected: debug-level logging messages appear
Actual behavior: only info-level and error-level logging appears
Is that sufficient? I can also create a Prefect issue and add a minimal example flow 🤔
Copy code
flow.environment = DaskKubernetesEnvironment(
    min_workers=1, max_workers=10, labels=["k8s"]
)

flow.storage = Docker(
    ...,
    env_vars={"PREFECT__LOGGING__LEVEL": "DEBUG"},
)
PS: Adding the minimal flow example. 🙂
n
Thanks @Robin! Let me pass this to the Core team, I suspect there's some agent-specific issue here
r
OK, thank you for forwarding! I'm happy to help debug in whatever way possible, as this is currently blocking us quite a bit!
Digging a bit deeper, I just found this issue, which describes exactly the behavior we observed: https://github.com/PrefectHQ/prefect/issues/3231
j
@Robin would you mind opening an issue for this (that links to the issue you mentioned above) with your flow and agent configuration? I’m thinking there may be something non-obvious in the env var / config setting codepath where something is overwriting the log level you are setting
r
Actually, issue 3231 gives a pretty plausible description of what might be going wrong, and @josh already commented:
@alexifm Good call out! Yeah those places in population functions where the env vars are extended we could do a simple check to only add them if they are not already provided.
It seems like it should be a straightforward fix! I can open a new issue, but it would instantly be a duplicate, so does that really make sense? Or should I just add a comment on issue 3231?
j
Ah I see, yeah, add a comment there saying you're experiencing this as well, and we'll get it worked on
r
Done, thanks for forwarding! Waiting for prefect developers to comment on the discussion I had with alexifm 🙂
🚀 1
j
Currently the agent always sets the log level if you're using an environment-based configuration (the new experimental
run_config
based config does not). So for now, you'll need to set the log level in the agent's config for it to propagate to the flow runs.
Either
Copy code
[logging]
level = "DEBUG"
in your
config.toml
on the agent, or set
Copy code
PREFECT__LOGGING__LEVEL=DEBUG
in the agent's environment
r
@Jim Crist-Harif, thanks for your feedback. Have you read the issue: https://github.com/PrefectHQ/prefect/issues/3231? It seems like the solution you describe does not work for AWS EKS due to the described misbehavior of the dask-kubernetes environment.
PS: Just changed the question at the beginning of the thread, as your answer is definitely correct for most environments.
j
So #3231 is related to, but not specific to your issue (as I understand it). If you use a custom spec in the
DaskKubernetesEnvironment
, your log level will always be overridden (a PR fixing that is up now). If you're not using one, the default behavior is to forward the log level from the agent. Have you tried setting the log level on the agent? It should propagate through to the k8s job as currently written.
I just merged https://github.com/PrefectHQ/prefect/pull/3488 which fixes this, and it will be part of the next release (out today/tomorrow). After that, you should be able to override the log level in the environment spec (still not in the Docker image itself, as you were trying above), rather than having to override it in the agent config.
r
Have you tried setting the log level on the agent?
I actually set the environment variable when creating the Prefect agent. That's what you meant, right? It did not work though...
j
Hmmm, that's surprising to me. Are you using
prefect agent start
or
prefect agent install
?
r
Thanks a lot for merging the PR! 🙏 Once the release is public, how can I set the environment variable correctly?
Copy code
flow.environment = DaskKubernetesEnvironment(
    min_workers=1, max_workers=10, labels=["k8s"], env_vars={}
)
We use prefect agent start when setting it up with Pulumi:
j
We use prefect agent start when setting it up with pulumi:
And with the above config you're not seeing the log level set as debug?
👍 1
Once the release is public, how can I set the environment variable correctly?
Passing it in as part of the scheduler spec file should work.
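(For reference, a rough sketch of what that might look like once the fix is out — scheduler_spec.yaml is a placeholder filename for a custom scheduler spec whose env section sets PREFECT__LOGGING__LEVEL=DEBUG; not a confirmed recipe:)
Copy code
flow.environment = DaskKubernetesEnvironment(
    min_workers=1,
    max_workers=10,
    labels=["k8s"],
    # Hypothetical custom scheduler spec that sets PREFECT__LOGGING__LEVEL=DEBUG
    # in the scheduler pod's env section.
    scheduler_spec_file="scheduler_spec.yaml",
)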
In the future we're moving to a new config system based on a
RunConfig
class instead of
Environment
classes. In this case setting
Copy code
flow.run_config = KubernetesRun(env={"PREFECT__LOGGING__LEVEL": "DEBUG"})
would be all that's needed. This was part of the last release, but is still experimental. If you want to try it out, please let me know how it works for you.
r
And with the above config you're not seeing the log level set as debug?
Yes, as described in the issue on GitHub, it was unsuccessful.
🤔 1
This was part of the last release, but is still experimental. If you want to try it out, please let me know how it works for you.
Will try it out 🙂 Does this require any changes on our agent or so?
j
You'd need an agent running the latest version, but other than that no.
👍 1
Only docstring docs for now, but there are some examples in the docstring that might be useful: https://docs.prefect.io/api/latest/run_configs.html#kubernetesrun
r
OK, I have the agent running on 13.10 and submitted this minimal_flow.py, but it seems like the flow is hanging in a Scheduled state 🤔
Copy code
import os
from datetime import datetime

import prefect
from prefect import Flow, task
from prefect.environments import DaskKubernetesEnvironment
from prefect.environments.storage import Docker
from prefect.run_configs import KubernetesRun

@task
def debug_logging_task():
    logger = prefect.context.get("logger")

    logger.debug("a debug message")
    logger.info(f"PREFECT__LOGGING__LEVEL: {os.environ.get('PREFECT__LOGGING__LEVEL')}")


with Flow("minimal_flow") as flow:

    debug_logging_task()


flow.environment = DaskKubernetesEnvironment(
    min_workers=1, max_workers=10, labels=["k8s"]
)

flow.run_config = KubernetesRun(env={"PREFECT__LOGGING__LEVEL": "DEBUG"})

flow.storage = Docker(
    python_dependencies=[
        # "numpy",
        # "pandas",
        # "snowflake-connector-python[pandas]==2.3.2",
        # "snowflake-sqlalchemy>=1.2.4",
        "tqdm",
    ],
    registry_url="some_repo.dkr.ecr.eu-central-1.amazonaws.com",
    image_name="minimal_flow",
    image_tag="beta_" + datetime.now().strftime("%Y%m%d_%H%M%S"),
    env_vars={"PREFECT__LOGGING__LEVEL": "DEBUG"},    
)
flow.register(project_name="eks_test_01")
j
KubernetesRun
is a replacement for environments - if a
run_config
is present on a flow, the
environment
is completely ignored. So you can drop setting
flow.environment
, and you'll need to add the
labels
to the
KubernetesRun
object (this is why it hung in a Scheduled state, since the flow labels don't match your agent).
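(Putting that together, a minimal sketch of the adjusted registration — storage details elided, everything else as in the flow above:)
Copy code
from prefect.run_configs import KubernetesRun

# Drop flow.environment entirely; move the agent labels onto the run config.
flow.run_config = KubernetesRun(
    labels=["k8s"],
    env={"PREFECT__LOGGING__LEVEL": "DEBUG"},
)

flow.storage = Docker(...)  # unchanged from the flow above
flow.register(project_name="eks_test_01")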
r
Oh that makes sense, thanks! Changed it in the minimal flow and it worked flawlessly. Will change it now in the bigger flow; we can definitely mark this thread as closed for the moment 🙂
PS: It appears the tasks no longer run in parallel. Do I need to set any further variables to activate parallelization?
j
Yeah, the
KubernetesRun
is just for the initial job - it doesn't use dask anymore. To configure dask usage, you'd need to configure an
executor
on the flow. If your tasks don't use a high amount of memory or require so much parallelism that running in multiple k8s pods is worth it, configuring
Copy code
flow.executor = LocalDaskExecutor(num_workers=...)  # run in threads

# or

flow.executor = DaskExecutor()  # run in local processes
is what I'd recommend. Otherwise there's a bit more configuration required - I can write up an example for distributed dask usage with
KubernetesRun
if you need it.
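(In the meantime, a rough sketch of what distributed Dask with KubernetesRun could look like — this assumes dask-kubernetes is installed in the flow image and that a worker pod spec is configured for KubeCluster via dask config or cluster_kwargs; it is not the promised example:)
Copy code
from prefect.engine.executors import DaskExecutor
from prefect.run_configs import KubernetesRun

flow.run_config = KubernetesRun(labels=["k8s"])

# Each flow run spins up an ephemeral Dask cluster inside the same k8s cluster
# and adaptively scales it between 1 and 10 workers.
flow.executor = DaskExecutor(
    cluster_class="dask_kubernetes.KubeCluster",
    adapt_kwargs={"minimum": 1, "maximum": 10},
)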
r
Thanks for the information! We have a flow that consists of several tasks (~10) that are each mapped and run for ~ 50k systems. So I think the example for
distributed dask usage with KubernetesRun
would be very helpful!
PS: Just to clarify, the two options you mentioned above are valid for any Kubernetes cluster, whether local or on EKS, right?