Billy McMonagle

07/21/2022, 6:15 PM

On 1.0, is there a way to run a ShellTask where the stdout/stderr is streamed in its raw format, bypassing the prefect logger?

✅ 1

Billy McMonagle

07/21/2022, 6:16 PM

I am using the DbtShellTask and would like to run my dbt commands with the `--log-format json` option, so that the json-formatted logs will be shipped to our logging service and be easier to read. Unfortunately, I'm not sure that this is possible due to this code

Billy McMonagle

07/21/2022, 6:17 PM

The logs come through like this, which is obviously not ideal

Anna Geller

07/21/2022, 6:41 PM

sorry to see this issue, Billy - would you want to look at contributing a fix or finding a workaround? we won't be able to look at this in 1.0, but we may investigate more in the prefect-dbt collection for 2.0

Billy McMonagle

07/21/2022, 6:44 PM

I totally understand, I just found out about this option. I'd really love to see it in 2.0. If I'm able to come up with a solution I'll try to make a contribution. Thanks!

🙌 1

Anna Geller

07/21/2022, 6:45 PM

absolutely and really sorry I can't dive deeper to help but things are incredibly busy with 2.0 GA release planned for Wednesday next week

Billy McMonagle

07/21/2022, 6:46 PM

Good luck! I'm excited for the release and will be closely watching.

❤️ 1

Billy McMonagle

07/27/2022, 7:54 PM

Just to close the loop, after a bit of trial and error I finally found the workaround. The tricky bit was figuring out how parent/child loggers work... I couldn't figure out why the task logger didn't have any handlers for me to modify, but it's because child loggers are created with no handlers of their own, so you must reference the parent's handlers directly.

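import logging

from prefect.tasks.dbt import DbtShellTask
from prefect.utilities.logging import CloudHandler

# APP and DBT_ENVIRONMENT are defined elsewhere in the project.
shell_task = DbtShellTask(
    name="DBT",
    profile_name=APP,
    environment=DBT_ENVIRONMENT,
    stream_output=True,
    log_stderr=True,
)
# emit dbt json-formatted logs directly for friendly display in datadog.
formatter = logging.Formatter("%(message)s")
# the task logger is a child logger with no handlers, so patch the parent's.
for handler in shell_task.logger.parent.handlers:
    if not isinstance(handler, CloudHandler):
        handler.setFormatter(formatter)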

The task is then executed like this (showing only the most relevant code):

@task
def execute_dbt(
    command,
    schema=None,
    select=None,
    exclude=None,
    full_refresh=False,
    **kwargs,
):
    dbt = init_dbt(schema)
    if select:
        command += f" --select {select}"
    if exclude:
        command += f" --exclude {exclude}"
    if full_refresh:
        command += " --full-refresh"
    dbt.run(command=command, **kwargs)

dbt_build = execute_dbt(
    task_args={"name": "dbt Build"},
    command="dbt --log-format json build",
    select=select,
    exclude=exclude,
    full_refresh=full_refresh,
    upstream_tasks=[dbt_dependencies],
)
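
For anyone following along, the parent/child logger behaviour described above can be reproduced with the standard library alone. This is only a minimal sketch, not Prefect's actual logger setup: the logger names `demo_parent` and `task` and the initial format string are made up for illustration, and the output is captured in an in-memory stream instead of going to a real log handler.

```python
import io
import logging

# Hypothetical logger names, chosen only for this illustration.
parent = logging.getLogger("demo_parent")
parent.setLevel(logging.INFO)
parent.propagate = False  # keep the demo records away from the root logger

# Capture output in memory so the effect of the formatter is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s - %(name)s | %(message)s"))
parent.addHandler(handler)

# Creates the logger "demo_parent.task".
child = parent.getChild("task")

# The child has no handlers of its own; its records propagate to the parent's.
print(child.handlers)
print(child.parent is parent)

child.info('{"msg": "before"}')

# Swap the formatter on the parent's handlers so only the raw message is emitted.
for h in child.parent.handlers:
    h.setFormatter(logging.Formatter("%(message)s"))

child.info('{"msg": "after"}')

lines = stream.getvalue().splitlines()
print(lines)
```

The first record carries the level and logger name prefix, while the second is just the JSON string, which is the same effect the workaround relies on when it changes the formatter on `shell_task.logger.parent.handlers`.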

🙌 1

🙏 1

Anna Geller

07/28/2022, 12:34 AM

this is awesome, thanks so much for sharing

Jacob Blanco

08/16/2022, 6:14 AM

Sorry to hijack the thread but I’m curious @Billy McMonagle if you wouldn’t mind sharing how you are getting the Prefect logs into DataDog.

Anna Geller

08/16/2022, 10:50 AM

@Jacob Blanco Billy shared his approach here https://github.com/PrefectHQ/prefect/discussions/5142

🦜 1

🙏 1

👍 1
