Robin
10/09/2020, 11:49 AM
env_vars={"PREFECT__LOGGING__LEVEL": "DEBUG"}, in flow.storage = Docker(...).
However, we still don't see the debug level log messages in cloud 🤔
Do we need to change anything else?
nicholas
config.toml in your agent's environment as well:
[logging]
level="DEBUG"
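For reference, the environment variable and the config.toml entry mentioned in this thread address the same setting: Prefect 0.x reads variables named PREFECT__SECTION__KEY as overrides for the matching config.toml key. A minimal sketch of that naming convention follows. `env_var_to_config_path` is a hypothetical helper written for illustration, not a Prefect function, and it only mirrors the convention rather than Prefect's own loader.

```python
def env_var_to_config_path(name, prefix="PREFECT"):
    """Map PREFECT__LOGGING__LEVEL -> ("logging", "level").

    Hypothetical helper for illustration only; it mirrors the
    double-underscore naming convention, not Prefect's own loader.
    """
    head, sep, rest = name.partition("__")
    if head != prefix or not sep or not rest:
        raise ValueError(f"{name!r} is not a {prefix}__SECTION__KEY variable")
    # Each double underscore separates one level of nesting in config.toml.
    return tuple(part.lower() for part in rest.split("__"))


path = env_var_to_config_path("PREFECT__LOGGING__LEVEL")
print(path)  # ('logging', 'level'), i.e. the `level` key under [logging]
```

So setting PREFECT__LOGGING__LEVEL=DEBUG in a process's environment targets the same key as level="DEBUG" under [logging]; which process needs it (agent or flow run) is the open question in this thread.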
Robin
10/09/2020, 2:15 PM
nicholas
config.toml in your k8s agent environment (and then restart your agent)
Robin
10/10/2020, 4:10 PM
env_vars={"PREFECT__LOGGING__LEVEL": "DEBUG"}, does not work on AWS EKS, dask-kubernetes cluster?
nicholas
Robin
10/12/2020, 6:51 PM
flow.environment = DaskKubernetesEnvironment(
    min_workers=1, max_workers=10, labels=["k8s"]
)
flow.storage = Docker(
    ...,
    env_vars={"PREFECT__LOGGING__LEVEL": "DEBUG"},
)
nicholas
Robin
10/12/2020, 7:10 PM
josh
10/12/2020, 7:23 PM
Robin
10/12/2020, 7:27 PM
@alexifm Good call out! Yeah, those places in population functions where the env vars are extended, we could do a simple check to only add them if they are not already provided.
It seems like it should be a straightforward solution! I can add a new issue, but I think it would instantly be a duplicate. Does that really make sense? Or should I just add a comment on issue 3231?
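The check described in the quoted comment (only add a default env var when one with that name has not already been provided) could look roughly like the sketch below. This is not Prefect's actual code: `extend_env` is a hypothetical helper, and the default values are made up for illustration. The list-of-dicts shape is the one Kubernetes uses for container env entries.

```python
def extend_env(existing, defaults):
    """Append default env entries without overriding user-provided ones.

    `existing` is a list of {"name": ..., "value": ...} dicts (the shape of
    container env entries in a Kubernetes pod spec); `defaults` is a dict.
    Hypothetical helper for illustration, not part of Prefect.
    """
    provided = {item["name"] for item in existing}
    extra = [
        {"name": name, "value": value}
        for name, value in defaults.items()
        if name not in provided  # skip anything the user already set
    ]
    return existing + extra


# Illustrative values only: the user asks for DEBUG, the defaults say INFO.
user_env = [{"name": "PREFECT__LOGGING__LEVEL", "value": "DEBUG"}]
defaults = {
    "PREFECT__LOGGING__LEVEL": "INFO",
    "PREFECT__LOGGING__LOG_TO_CLOUD": "true",
}
merged = extend_env(user_env, defaults)
```

With this check the user's DEBUG entry survives and only the missing default is appended, instead of the default being added unconditionally.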
josh
10/12/2020, 7:28 PM
Robin
10/12/2020, 8:25 PM
Jim Crist-Harif
10/13/2020, 12:37 PM
run_config based config does not). So for now, you'll need to set the log level in the agent's config for it to propagate to the flow runs.
[logging]
level = "DEBUG"
in your config.toml on the agent, or set PREFECT__LOGGING__LEVEL=DEBUG in the agent's environment
Robin
10/13/2020, 3:13 PM
Jim Crist-Harif
10/13/2020, 3:16 PM
DaskKubernetesEnvironment, your log level will always be overridden (PR fixing that is up now). If you're not, the default behavior is to forward on the log level from the agent. Have you tried setting the log level on the agent? It should propagate through to the k8s job as currently written.
Robin
10/13/2020, 3:26 PM
Have you tried setting the log level on the agent?
I actually set the environment variable when creating the prefect agent. That's what you meant, right? It did not work though...
Jim Crist-Harif
10/13/2020, 3:28 PM
prefect agent start or prefect agent install?
Robin
10/13/2020, 3:33 PM
flow.environment = DaskKubernetesEnvironment(
    min_workers=1, max_workers=10, labels=["k8s"], env_vars={}
)
Jim Crist-Harif
10/13/2020, 3:36 PM
We use prefect agent start when setting it up with pulumi:
And with the above config you're not seeing the log level set as debug?
Once the release is public, how can I set the environment variable correctly?
Passing in as part of the scheduler spec file should work.
RunConfig class instead of Environment classes. In this case setting
flow.run_config = KubernetesRun(env={"PREFECT__LOGGING__LEVEL": "DEBUG"})
would be all that's needed. This was part of the last release, but is still experimental. If you want to try it out, please let me know how it works for you.
Robin
10/13/2020, 5:45 PM
And with the above config you're not seeing the log level set as debug?
Yes, as described in the issue on GitHub, it was unsuccessful.
This was part of the last release, but is still experimental. If you want to try it out, please let me know how it works for you.
Will try it out 🙂 Does this require any changes on our agent or so?
Jim Crist-Harif
10/13/2020, 6:20 PM
Robin
10/13/2020, 9:15 PM
import os
from datetime import datetime

import prefect
from prefect import Flow, task
from prefect.environments import DaskKubernetesEnvironment
from prefect.environments.storage import Docker
from prefect.run_configs import KubernetesRun


@task
def debug_logging_task():
    logger = prefect.context.get("logger")
    logger.debug("a debug message")
    logger.info(f"PREFECT__LOGGING__LEVEL: {os.environ.get('PREFECT__LOGGING__LEVEL')}")


with Flow("minimal_flow") as flow:
    debug_logging_task()

flow.environment = DaskKubernetesEnvironment(
    min_workers=1, max_workers=10, labels=["k8s"]
)
flow.run_config = KubernetesRun(env={"PREFECT__LOGGING__LEVEL": "DEBUG"})
flow.storage = Docker(
    python_dependencies=[
        # "numpy",
        # "pandas",
        # "snowflake-connector-python[pandas]==2.3.2",
        # "snowflake-sqlalchemy>=1.2.4",
        "tqdm",
    ],
    registry_url="some_repo.dkr.ecr.eu-central-1.amazonaws.com",
    image_name="minimal_flow",
    image_tag="beta_" + datetime.now().strftime("%Y%m%d_%H%M%S"),
    env_vars={"PREFECT__LOGGING__LEVEL": "DEBUG"},
)
flow.register(project_name="eks_test_01")
Jim Crist-Harif
10/13/2020, 9:19 PM
KubernetesRun is a replacement for environments - if a run_config is present on a flow, the environment is completely ignored. So you can drop setting flow.environment, and you'll need to add the labels to the KubernetesRun object (this is why it's hung in a scheduled state, since the flow labels don't match your agent).
Robin
10/14/2020, 8:11 AM
Jim Crist-Harif
10/14/2020, 12:34 PM
KubernetesRun is just for the initial job - it doesn't use dask anymore. To configure dask usage, you'd need to configure an executor on the flow. If you don't expect your tasks to use a high amount of memory or require a huge amount of parallelism (such that running in multiple k8s pods is worth it) to complete quickly, configuring
flow.executor = LocalDaskExecutor(num_workers=...)  # run in threads
# or
flow.executor = DaskExecutor()  # run in local processes
is what I'd recommend. Otherwise there's a bit more configuration required - can write up an example for distributed dask usage with KubernetesRun if you need it.
Robin
10/14/2020, 1:16 PM
distributed dask usage with KubernetesRun would be very helpful!
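Putting the advice from this thread together (drop flow.environment, move the labels onto KubernetesRun, and set an executor on the flow), a revised version of the minimal flow might look like the sketch below. It assumes Prefect 0.13.x, where LocalDaskExecutor can be imported from prefect.engine.executors (later releases moved it). num_workers=4 is an arbitrary illustrative value, and `build_flow` is a hypothetical wrapper. Storage is omitted and would stay as in the flow posted earlier. This runs tasks in local threads; it is not the distributed dask setup asked about above.

```python
# Settings taken from the thread: the log-level env var and the agent label.
LOG_ENV = {"PREFECT__LOGGING__LEVEL": "DEBUG"}
AGENT_LABELS = ["k8s"]


def build_flow():
    """Build the minimal flow with a run_config instead of an environment."""
    # Imported here so the sketch can be read or imported without Prefect.
    import prefect
    from prefect import Flow, task
    from prefect.engine.executors import LocalDaskExecutor  # assumes 0.13.x
    from prefect.run_configs import KubernetesRun

    @task
    def debug_logging_task():
        logger = prefect.context.get("logger")
        logger.debug("a debug message")

    with Flow("minimal_flow") as flow:
        debug_logging_task()

    # No flow.environment: it is ignored once a run_config is present.
    # The labels move onto the run_config so they match the agent's labels.
    flow.run_config = KubernetesRun(env=LOG_ENV, labels=AGENT_LABELS)
    # Run tasks in threads inside the single Kubernetes job.
    flow.executor = LocalDaskExecutor(num_workers=4)
    return flow
```

Calling build_flow() and then registering the result as before should leave the flow run schedulable by an agent labelled "k8s", assuming the installed release supports run_config.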