prefect-community
  • w

    William Smith

    08/24/2020, 9:16 AM
I'm on the developer tier - why are my flows sometimes stuck in Scheduled for a long time? I don't have anything else taking up a concurrency slot.
    j
    • 2
    • 11
  • t

    Trever Mock

    08/24/2020, 1:15 PM
Good morning! I had a quick question on registering flows. I'm trying to figure out the best way for our team to deploy our Prefect workflows in production. From what I can tell, there are two ways: 1) flow.register() in Python, 2) export the flow to JSON and import it using GraphQL. For option 1, we have a unique situation where we can't run Python commands in our production environment. For option 2, some of the Docker storage info changes from development to production, which means we have to edit the JSON files. So, hopefully a simple question: are there other/better ways to package a workflow for deployment that I might be missing? I can run Python commands in development, just not in production.
    k
    • 2
    • 2
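
One hedged option for Trever's case, since registration only needs network access to the API rather than the production machine itself: run flow.register() from a CI/CD step and parameterize the Docker storage per environment. A minimal sketch (names and environment variables are illustrative):

    import os
    from prefect import Flow
    from prefect.environments.storage import Docker

    flow = Flow("etl")  # assume the flow's tasks are attached elsewhere

    # storage details that differ between dev and prod come from the CI environment
    flow.storage = Docker(
        registry_url=os.environ["DOCKER_REGISTRY_URL"],
        image_name="etl-flow",
    )
    flow.register(project_name=os.environ.get("PREFECT_PROJECT", "dev"))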
  • r

    Richard Hughes

    08/24/2020, 3:42 PM
Hi, I had a VM that was abruptly turned off over the weekend and now I cannot get the Prefect Server UI to work. Is there a troubleshooting guide?
    k
    • 2
    • 22
  • k

    Kyle McEntush

    08/24/2020, 4:19 PM
Can a state handler launch another Prefect task? My end goal is to handle failed tasks via another task. Right now I've implemented everything in a state handler, but I don't get Prefect logging/tracking like I do with a task.
    k
    • 2
    • 4
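
A state handler is plain Python and can't launch a tracked Prefect task itself; a common hedged alternative is a dedicated downstream task with an any_failed trigger, which gets normal Prefect logging. A minimal sketch:

    from prefect import task, Flow
    from prefect.triggers import any_failed

    @task
    def risky():
        raise ValueError("boom")

    @task(trigger=any_failed)
    def handle_failure():
        # runs only when an upstream task failed, with full Prefect
        # logging/tracking because it is an ordinary task
        print("cleaning up after a failure")

    with Flow("failure-handling") as flow:
        handle_failure(upstream_tasks=[risky()])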
  • k

    Kyle McEntush

    08/24/2020, 7:30 PM
Two quick questions: within a flow (using the imperative API), how can I view reference tasks? And how can I set a task to be a reference task or not? My code currently consists of using flow.add_task() and task.set_upstream(). Specifically, I want to make sure that my triggers depend on only the tasks I think they really do. For example, in my pipeline (image attached), I want the trigger for the next task coming off of valid_unit_reducer to fire on any_successful for valid_unit_reducer, and not on any_successful for invalid_unit_reducer. Maybe this is the default behavior in Prefect, but my current understanding is that triggers relate to all tasks and not just the immediately upstream task.
    k
    • 2
    • 2
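
For the first question, a short sketch: Flow exposes reference tasks directly (assuming the flow and the valid_unit_reducer task from the message above):

    # inspect which tasks currently determine the flow's final state
    print(flow.reference_tasks)

    # pin the flow's final state to specific tasks
    flow.set_reference_tasks([valid_unit_reducer])

As for the second concern: a task's trigger only evaluates the states of its immediate upstream tasks, not every task in the flow.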
  • m

    Minakshi

    08/25/2020, 12:50 AM
Hi all, I am getting this error while importing Prefect:

    File "/Users/mkorad/PycharmProjects/altruistic-armadillo/src/**", line 1, in <module>
        from prefect import task, Flow, Parameter
    ModuleNotFoundError: No module named 'dask.system'

Any idea about this error?
    k
    • 2
    • 12
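
A hedged guess: dask.system only exists in newer Dask releases, so an outdated dask install alongside a recent Prefect typically causes this import error. A quick check:

    import dask
    print(dask.__version__)
    # if this is old, upgrading usually resolves the error:
    #   pip install --upgrade dask distributed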
  • a

    Alfie

    08/25/2020, 4:15 AM
hi, is there an easy way to figure out which agents are connecting to Apollo? I cannot get my agent to pick up a flow run, so I suspect the run is being picked up by some other agent that was started unintentionally. Thanks
    n
    • 2
    • 7
  • r

    Robin

    08/25/2020, 8:42 AM
Hey there, is there a general way to troubleshoot Kubernetes agents? I spun up an EKS cluster using pulumi and created an agent locally using prefect agent start kubernetes --token <token_id> --label k8s. It seems like the Kubernetes agent is running correctly (see attached images). However, when I submit a flow that has the same label k8s, it does not execute (see image)… 1. How do I make sure the Kubernetes agent is set up properly? 2. Which tests does Prefect Cloud already perform by itself to make sure that an agent is set up properly?
    j
    • 2
    • 11
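
One hedged thing to verify for Robin's setup: a flow run is only picked up when the agent's labels cover the flow's labels, and in this Prefect version the flow's labels come from its environment. A sketch (project name illustrative):

    from prefect.environments import RemoteEnvironment

    # the flow's labels must be a subset of the agent's labels ("k8s" here);
    # note that Local storage silently adds the registering machine's hostname
    # as an extra label, a common reason runs are never matched
    flow.environment = RemoteEnvironment(labels=["k8s"])
    flow.register(project_name="my-project")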
  • w

    William Smith

    08/25/2020, 10:11 AM
Hi all, I'm having an issue with a task that has a manual_only trigger: when I resume the task, it only resumes after ~10 minutes, when the Lazarus process picks it up. I would have expected it to start straight away. It works fine if I run the flow on my local machine, but it becomes an issue when I run it in the cloud. I've tried adding a LocalResult to my flow but that hasn't worked. Here is a pastebin with a very simple flow so hopefully you can reproduce the issue: https://dpaste.com/9DRBEHZXK
    j
    j
    • 3
    • 11
  • m

    Manuel Mourato

    08/25/2020, 2:10 PM
Hello all, apologies if this is a basic question, but I am trying to checkpoint the output of a task, like so:

    from datetime import timedelta
    import os

    from prefect.tasks.shell import ShellTask
    from prefect.engine.results import LocalResult
    from default_task_handler import tasks_notifications_handler

    os.environ["PREFECT__FLOWS__CHECKPOINTING"] = "true"

    a = ShellTask(
        max_retries=3,
        retry_delay=timedelta(minutes=60),
        timeout=1800,
        state_handlers=[tasks_notifications_handler],
        checkpoint=True,
        result=LocalResult(dir="/home/my-user/weekly_execution"),
        command="ls /home/my-user/",
    )
    a.run()

The task runs and the weekly_execution directory is created, but nothing is persisted. What am I doing wrong? Is it mandatory that the task be part of a flow? Thank you. UPDATE: Indeed, if I run the task inside a flow, checkpointing works. Is there a way to do it for individual tasks?
    j
    • 2
    • 2
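
For reference, a minimal sketch of the working variant Manuel describes, reusing the ShellTask a from the snippet above (with PREFECT__FLOWS__CHECKPOINTING=true set before Prefect is imported):

    from prefect import Flow

    # checkpointing is applied by the flow runner, so a bare task.run()
    # never persists results; a one-task flow does
    with Flow("checkpoint-demo") as flow:
        listing = a()

    flow.run()  # result persisted under /home/my-user/weekly_execution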
  • w

    William Smith

    08/25/2020, 2:30 PM
Are there any examples of calling the LambdaInvoke task? I can't seem to get it working...
    j
    j
    • 3
    • 4
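
A hedged sketch of one way to call it; the function name and payload are illustrative, and AWS credentials are assumed to come from the usual boto3 chain:

    import json
    from prefect import Flow
    from prefect.tasks.aws.lambda_function import LambdaInvoke

    invoke = LambdaInvoke(
        function_name="my-function",            # illustrative
        payload=json.dumps({"event": "test"}),
    )

    with Flow("invoke-lambda") as flow:
        result = invoke()

    flow.run()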
  • l

    Lukas

    08/25/2020, 3:38 PM
I'm running a flow via the Fargate agent and am suddenly getting this message in the Fargate logs: [2020-08-25 15:34:14] DEBUG - prefect.CloudFlowRunner | Flow 'Fetch-Authors': start_time has not been reached; ending run. The flow is Submitted for execution, but Fargate basically shuts down and stops the task. Any idea why this could happen? I've run this flow a lot of times before and never experienced this.
    j
    • 2
    • 5
  • j

    Jason Oban

    08/25/2020, 5:01 PM
Is there a way to programmatically add and remove extra loggers, or can extra loggers only be configured via environment variables or the Prefect config?
    j
    • 2
    • 2
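
Besides the PREFECT__LOGGING__EXTRA_LOGGERS environment variable, a hedged programmatic route is to attach Prefect's configured handlers to a stdlib logger by hand (logger name illustrative):

    import logging
    from prefect.utilities.logging import get_logger

    mylogger = logging.getLogger("my_library")
    for handler in get_logger().handlers:
        mylogger.addHandler(handler)
    mylogger.setLevel(logging.INFO)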
  • k

    Kyle McEntush

    08/25/2020, 6:26 PM
Is there a way to do a full stop if a task fails? In my graph, the mapped validate_unit tasks all get TriggerFailed. My invalid_unit_reducer is set to trigger on any failure. Is there a way I can set it to trigger on failures that aren't trigger failures, but rather hard fails? In my setup, a trigger failure should be treated differently than a true failure.
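
A hedged sketch of one approach: triggers are plain functions over the upstream states, and TriggerFailed is a subclass of Failed, so it can be filtered out explicitly:

    from prefect import task
    from prefect.engine import signals
    from prefect.engine.state import Failed, TriggerFailed

    def any_hard_failed(upstream_states):
        # fire on real failures, but not on propagated trigger failures
        if any(
            isinstance(s, Failed) and not isinstance(s, TriggerFailed)
            for s in upstream_states.values()
        ):
            return True
        raise signals.TRIGGERFAIL("no hard failures upstream")

    @task(trigger=any_hard_failed)
    def reduce_invalid_units():  # illustrative stand-in for invalid_unit_reducer
        ...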
  • k

    Kyle McEntush

    08/25/2020, 6:26 PM
    Apologies for the second post. Forgot to include the image
    j
    • 2
    • 4
  • m

    Marwan Sarieddine

    08/25/2020, 7:39 PM
Hi folks - I am running into API errors trying to poll the status of a flow run. It seems like a Cloud API issue - is anyone else facing similar issues?
    status = client.get_flow_run_info(flow_run_id)
      File "~/.pyenv/versions/3.7.7/envs/etl/lib/python3.7/site-packages/prefect/client/client.py", line 990, in get_flow_run_info
        result = self.graphql(query).data.flow_run_by_pk  # type: ignore
      File "~/.pyenv/versions/3.7.7/envs/etl/lib/python3.7/site-packages/prefect/client/client.py", line 281, in graphql
        retry_on_api_error=retry_on_api_error,
      File "~/.pyenv/versions/3.7.7/envs/etl/lib/python3.7/site-packages/prefect/client/client.py", line 237, in post
        retry_on_api_error=retry_on_api_error,
      File "~/.pyenv/versions/3.7.7/envs/etl/lib/python3.7/site-packages/prefect/client/client.py", line 373, in _request
        token = self.get_auth_token()
      File "~/.pyenv/versions/3.7.7/envs/etl/lib/python3.7/site-packages/prefect/client/client.py", line 503, in get_auth_token
        self._refresh_access_token()
      File "~/.pyenv/versions/3.7.7/envs/etl/lib/python3.7/site-packages/prefect/client/client.py", line 630, in _refresh_access_token
        token=self._refresh_token,
      File "~/.pyenv/versions/3.7.7/envs/etl/lib/python3.7/site-packages/prefect/client/client.py", line 294, in graphql
        raise ClientError(result["errors"])
    prefect.utilities.exceptions.ClientError: [{'path': ['refresh_token'], 'message': 'Unable to complete operation', 'extensions': {'code': 'API_ERROR'}}]
    j
    • 2
    • 4
  • m

    Marwan Sarieddine

    08/25/2020, 8:19 PM
Hmm - the Prefect Cloud UI is not reflecting the correct status of the task runs. I see from the logs that tasks are going into a Mapped or Cached state, but the Gantt chart only shows Pending task runs for some reason, even after refreshing the flow-run page (please see the attached screenshot).
    👀 1
    j
    m
    c
    • 4
    • 7
  • a

    An Hoang

    08/25/2020, 11:07 PM
How is everyone versioning the output based on which version of the flow/task created it? I was thinking of recreating this approach in Prefect, where the Task object has a version and a hash that combines the version of the current Task object with those of all Tasks before it. The output file would contain the final/flow-level hash. Any ideas on how to approach this differently?
    j
    • 2
    • 3
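
A hedged sketch of the chained-hash idea: each task's hash folds its own version into the hashes of everything upstream, and the flow-level hash tags the output file (all names illustrative):

    import hashlib

    def chain_hash(version: str, upstream_hashes: list) -> str:
        # combine a task's own version with all upstream hashes
        h = hashlib.sha256(version.encode())
        for up in sorted(upstream_hashes):  # sort for order-independence
            h.update(up.encode())
        return h.hexdigest()[:12]

    extract_h = chain_hash("v1", [])
    transform_h = chain_hash("v2", [extract_h])
    output_path = f"result_{transform_h}.parquet"  # illustrative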
  • r

    Riley Hun

    08/26/2020, 1:19 AM
Hi everyone, hope you're having a good week so far. Pardon my ignorance, but a quick question: I am using a static Dask cluster hosted on Google Kubernetes Engine on GCP, set to auto-scale depending on CPU usage. I am running a mapped task to unzip and extract files from several zip folders, but I'm noticing that the number of Dask workers remains at 1. Should I expect Prefect mapped tasks to spawn additional Dask workers to accomplish the parallelization? Checking the logs, it looks like the files are extracted sequentially even though I am using a mapped task.
    m
    • 2
    • 2
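
One hedged thing to check: mapped tasks only fan out across Dask workers when the flow run uses a DaskExecutor pointed at the cluster's scheduler; otherwise mapping executes sequentially. A sketch (scheduler address illustrative):

    from prefect.engine.executors import DaskExecutor

    executor = DaskExecutor(address="tcp://dask-scheduler:8786")
    flow.run(executor=executor)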
  • h

    Howard Cornwell

    08/26/2020, 2:12 PM
Hey, been searching for a while but having no luck; what's the correct environment variable for configuring the Prefect Server DB host? I've tried PREFECT__SERVER__DATABASE__HOST but it doesn't appear to work. Thanks
    j
    • 2
    • 4
  • s

    Slackbot

    08/26/2020, 4:38 PM
    This message was deleted.
    j
    • 2
    • 1
  • p

    Paweł

    08/26/2020, 6:11 PM
Hello all Prefect folks! I have a quick question: can I somehow "force" prefect-ui to use the network of the container in which it's deployed? I deployed the whole stack on k8s, and right now I need to forward Apollo to my localhost to use the UI (I don't want to expose anything without auth).
    j
    • 2
    • 4
  • j

    josh

    08/26/2020, 7:47 PM
Hey team, Prefect version 0.13.4 has been released! Here are a few notable changes: 📚 New Databricks task library task (I couldn't find the brick emoji) 🕵️ Custom YAML for k8s agents 🔗 Coupled versioning for Core / Server / UI. A big thank you to our contributors who helped out with this release! Full changelog:
    Untitled
    :prefect: 3
    🚀 6
    :party-parrot: 1
  • r

    Rob Fowler

    08/26/2020, 10:34 PM
I have an API question. If I have a flow that uses the same task a few times, how can I get the task results from the state when it finishes, given that the result dictionary is keyed by the task name?
  • r

    Rob Fowler

    08/26/2020, 10:42 PM
here is a non-working example, but it shows what I mean:
    from prefect import Flow, task, Parameter
    from flows.freq import request
    
    @task(name="Creating an order")
    def create_order(opts, user_agent, auth, js):
        return request(opts, 'POST', f"{opts.try_url}/orders", user_agent, auth, js)
    
    def create_service_requests(delivery_type):
        return {'dt': delivery_type}
    
    with Flow("azure_subscription_change_smacc") as flow:
        opts = Parameter('opts')
        user_agent = Parameter('user_agent')
        auth = Parameter('auth')
        regular_order_id_js = create_service_requests("Regular")
        accelerated_order_id_js = create_service_requests("Accelerated")
        regular_result = create_order(opts, user_agent, auth, regular_order_id_js)
        accelerated_result = create_order(opts, user_agent, auth, accelerated_order_id_js)
    
    if __name__ == '__main__':
        state = flow.run(opts={}, user_agent="blah", auth="blahauth")
  • r

    Rob Fowler

    08/26/2020, 10:44 PM
Ideally, I'd like to get regular_result and accelerated_result without using an index like state.result[0], etc.
  • r

    Rob Fowler

    08/26/2020, 11:12 PM
ok got it: if I have the task result objects, I can use them as an index into the results:
    def check(state):
        print(f"regular result: {state.result[regular_result].result}")
    👍 2
  • b

    Bob Colner

    08/27/2020, 3:17 AM
After upgrading from Prefect core 0.11.2 to 0.13.4, my flow is failing with an error related to 'Cloud', which is strange since I'm not using Prefect Cloud (or the local server/UI). Looks like it is related to the Slack notifications. Any ideas? (full logs posted in the thread)
    c
    • 2
    • 6
  • a

    Alfie

    08/27/2020, 3:58 AM
    Hi Team, now I’m using local storage to store flows in a local storage. But to my case, it’s a proper solution to store the flow into db. Any guides to achieve on this? Thanks
    k
    • 2
    • 6
  • s

    Sandeep Aggarwal

    08/27/2020, 1:32 PM
Hello team, it looks like the delete_flow_run mutation is clashing with Hasura's auto-generated mutation schemas. I am self-hosting Prefect and, as part of a data retention policy, I need to clean up old data objects. I am using the mutation below to clean up old flow/task runs:

    mutation($created_before: timestamptz) {
        delete_flow_run(where: {created: {_lt: $created_before}}) {
            affected_rows
        }
        delete_flow_run_state(where: {created: {_lt: $created_before}}) {
            affected_rows
        }
        delete_log(where: {created: {_lt: $created_before}}) {
            affected_rows
        }
        delete_task_run(where: {created: {_lt: $created_before}}) {
            affected_rows
        }
        delete_task_run_state(where: {created: {_lt: $created_before}}) {
            affected_rows
        }
    }

The request fails with the errors below:

    2020-08-27T12:40:17.586Z {"message":"Unknown argument \"where\" on field \"delete_flow_run\" of type \"Mutation\".","locations":[{"line":3,"column":37}],"extensions":{"code":"GRAPHQL_VALIDATION_FAILED"}}
    2020-08-27T12:40:17.586Z {"message":"Cannot query field \"affected_rows\" on type \"success_payload\".","locations":[{"line":7,"column":9}],"extensions":{"code":"GRAPHQL_VALIDATION_FAILED"}}
    2020-08-27T12:40:17.586Z {"message":"Field \"delete_flow_run\" argument \"input\" of type \"delete_flow_run_input!\" is required, but it was not provided.","locations":[{"line":3,"column":21}],"extensions":{"code":"GRAPHQL_VALIDATION_FAILED"}}

When I remove delete_flow_run from the above mutation, everything works fine.
    d
    • 2
    • 15
d

Dylan

08/27/2020, 1:54 PM
Hi @Sandeep Aggarwal, I have great news! You shouldn’t have to do this manually. The database is set to delete all dependent objects on cascade. So, as long as you delete the flow runs, all task runs, logs, etc will be deleted.
Try only deleting flow runs with that where clause. affected_rows does look like something from Hasura's schema, as our mutations only have success and error. Does affected_rows appear in the schema of the Interactive API?
s

Sandeep Aggarwal

08/27/2020, 2:00 PM
Thanks @Dylan for the quick response. I am referring to the Hasura docs for auto-generated schemas: https://hasura.io/docs/1.0/graphql/core/mutations/delete.html
This runs fine via the Hasura UI; however, when I run it via the client, it clashes with the one provided by Prefect.
d

Dylan

08/27/2020, 2:07 PM
That’s correct, we actually filter Hasura’s auto-generated schemas and compose mutations so that they work to achieve a specific goal across many tables
s

Sandeep Aggarwal

08/27/2020, 2:10 PM
Makes sense. The issue is that Prefect's mutation only accepts the ID of the flow run to be deleted. So in order to use it, I will have to query all flow runs and then delete them one by one. Is there any option to perform a bulk delete?
d

Dylan

08/27/2020, 2:10 PM
Not at this time, but that’s a great feature request!
Would you mind opening an issue on the server repo?
s

Sandeep Aggarwal

08/27/2020, 2:14 PM
Sure
d

Dylan

08/27/2020, 2:15 PM
Thank you!
And just to be clear, is the above mutation working in the Hasura UI as you expected?
s

Sandeep Aggarwal

08/27/2020, 2:16 PM
Yup it works as expected.
d

Dylan

08/27/2020, 2:16 PM
Great 👍
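
For reference, a hedged sketch of the per-run cleanup discussed above, using Prefect's Client; the input field name flow_run_id is inferred from the validation error earlier in the thread, not confirmed:

    from prefect import Client

    client = Client()

    # cascade deletes on flow runs remove task runs, logs, etc. (per Dylan)
    runs = client.graphql(
        """
        query($before: timestamptz) {
          flow_run(where: {created: {_lt: $before}}) { id }
        }
        """,
        variables={"before": "2020-08-01T00:00:00Z"},
    )
    for run in runs.data.flow_run:
        client.graphql(
            """
            mutation($id: UUID!) {
              delete_flow_run(input: {flow_run_id: $id}) { success }
            }
            """,
            variables={"id": run.id},
        )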