prefect-community
  • s

    Suresh R

    07/11/2022, 10:07 AM
    Hi! When I register the same flow (stored in Git) from a different system, the version gets bumped, but when I register from the same system it gets skipped. Why does this happen, and do we have an option to check what changed that creates a new version?
    ✅ 1
    1 reply · 2 participants
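    For reference, a minimal sketch of the Prefect 1.x idempotency-key pattern that controls when registration bumps the version; the flow and project names are illustrative. Registration skips the version bump when the key is unchanged, and serialized_hash() covers flow metadata such as storage, which can legitimately differ between machines.
    from prefect import Flow, task

    @task
    def say_hello():
        print("hello")

    with Flow("versioned-flow") as flow:
        say_hello()

    # a new version is created only when the serialized metadata (and hence
    # the hash) changes; storage details are part of that metadata
    flow.register(
        project_name="my-project",
        idempotency_key=flow.serialized_hash(),
    )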
  • i

    Ievgenii Martynenko

    07/11/2022, 10:14 AM
    Hello, is there an option not to store versions in Prefect at all? We have Git versioning instead.
    ✅ 1
    7 replies · 2 participants
  • i

    Ievgenii Martynenko

    07/11/2022, 10:30 AM
    Hello, also a few questions about retention and maintenance (non-Cloud): 1) Which tables in the DB do you recommend maintaining? Are there any purge plans? 2) Can any of the services in K8s be scaled? I.e. we have agent, hasura, apollo, graphql, towel, and ui, each with a single pod in K8s. What if we add one more graphql pod, for example? 3) Can we control the job run history retention period?
    ✅ 1
    1 reply · 2 participants
  • b

    Bogdan Serban

    07/11/2022, 10:51 AM
    Hello everyone! I am building a flow where inference with a PyTorch model is mapped over a series of images read from disk. Currently, I am loading the model within a dedicated task and passing it as an unmapped argument to the downstream map call. I am receiving a warning about the size of the ML model being shared in the task graph (more details in the thread). Is it a problem that I am receiving that warning? And is there a better way to share the ML model across the tasks?
    ✅ 1
    12 replies · 2 participants
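    For context, the pattern described above looks roughly like this in Prefect 1.x; the task bodies, checkpoint path, and image list are placeholders, not the poster's actual code.
    from prefect import Flow, task, unmapped

    @task
    def load_model():
        import torch
        return torch.load("model.pt")  # illustrative checkpoint path

    @task
    def list_images():
        return ["img_0.png", "img_1.png"]  # stand-in for the real disk scan

    @task
    def predict(image_path, model):
        return image_path  # stand-in for the real inference call

    with Flow("inference") as flow:
        model = load_model()  # loaded once, in its own task
        images = list_images()
        predictions = predict.map(images, model=unmapped(model))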
  • m

    Mathieu Cayssol

    07/11/2022, 11:20 AM
    Hey guys, I'm totally new to Prefect. I'm trying to use it, but right after pip install prefect in my conda environment I get a module import error.
    ✅ 1
    2 replies · 2 participants
  • j

    Jan

    07/11/2022, 11:56 AM
    Hi community, I seem to be completely misunderstanding how results work within flows. My Python code below does not print "just a string" to my prompt. If I inspect the vars, I can dissect it down to mijn_flow.result, which gives me a non-iterable object.
    from prefect import flow

    @flow
    def mijn_flow():
        return "just a string"

    print(mijn_flow())
    The output is:
    13:54:51.255 | INFO | prefect.engine - Created flow run 'brainy-parakeet' for flow 'mijn-flow'
    13:54:51.271 | INFO | Flow run 'brainy-parakeet' - Using task runner 'ConcurrentTaskRunner'
    13:54:51.302 | WARNING | Flow run 'brainy-parakeet' - No default storage is configured on the server. Results from this flow run will be stored in a temporary directory in its runtime environment.
    13:54:51.443 | INFO | Flow run 'brainy-parakeet' - Finished in state Completed()
    Completed()
    Can you point me in the right direction? I'm trying to access the return object of my flow (which is a string in this case).
    ✅ 1
    8 replies · 2 participants
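    For context, in the 2.0 betas calling a flow returns its final State rather than the raw return value, so the string has to be extracted with .result(); a minimal sketch against that beta API:
    from prefect import flow

    @flow
    def mijn_flow():
        return "just a string"

    state = mijn_flow()    # returns a State such as Completed(), not the string
    print(state.result())  # prints: just a string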
  • a

    Alexandru Anghel

    07/11/2022, 12:58 PM
    Hello, I am trying to run flows on a private Kubernetes cluster using the Dask executor and GCS storage. I am coming from a GKE test cluster where I was able to run flows using this approach. The catch is that the private cluster runs behind a corporate proxy. I've set the HTTPS_PROXY env variable inside the KubernetesRun job template, and Prefect is able to download the flow metadata from GCS. The problem is that the same pod creates the Dask cluster, and that fails with this error:
    RuntimeError(f"Cluster failed to start: {e}") from e RuntimeError: Cluster failed to start: 503, message='Service Unavailable', url=URL('<http://proxy-ip-here>:proxy-port-here')
    Any ideas on how to fix this? I've tried adding a NO_PROXY env variable alongside HTTPS_PROXY, but it doesn't work. I am using Prefect 1.2. Thanks!
    👀 1
    3 replies · 2 participants
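    For reference, a sketch of wiring both proxy variables through the flow's run config in Prefect 1.x; the NO_PROXY entries are illustrative, the point being to exclude in-cluster addresses (such as the Dask scheduler service) from the proxy:
    from prefect import Flow, task
    from prefect.run_configs import KubernetesRun

    @task
    def noop():
        pass

    with Flow("proxied-flow") as flow:
        noop()

    flow.run_config = KubernetesRun(
        env={
            "HTTPS_PROXY": "http://proxy-ip-here:proxy-port-here",
            "NO_PROXY": "localhost,127.0.0.1,.svc,.cluster.local",
        }
    )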
  • b

    Black Spy

    07/11/2022, 1:06 PM
    from prefect import Flow
    from prefect.tasks.prefect import StartFlowRun

    with Flow("hello-world") as hello_world_flow:
        task1 = hello_task()
        task2 = test(task1)

    with Flow("Multiple Flow") as multiple_flow:
        db_table = create_table()
        raw = get_complete_date()
        parsed = parse_complaint_data(raw)
        populated_table = store_complaints(parsed)
        populated_table.set_upstream(db_table)

    data_engineering_flow = StartFlowRun(flow_name="hello-world", project_name='ETL - Prefect', wait=True)
    data_science_flow = StartFlowRun(flow_name="Multiple Flow", project_name='ETL - Prefect', wait=True)

    with Flow("main-flow") as flow:
        result = data_science_flow(upstream_tasks=[data_engineering_flow])

    flow.run()
    flow.register(project_name="ETL - Prefect")
    I am facing some issues with this flow: after executing the code we can see the flow in the UI, but we can't see the task results, and it's taking a lot of time to process. Can anyone help me out with this?
    5 replies · 2 participants
  • b

    Binoy Shah

    07/11/2022, 1:42 PM
    Background. Existing infrastructure:
    • We have full-fledged Kubernetes clusters running on top of AWS/EKS.
    • We have stable Jenkins CI/CD pipelines to build and deploy Docker images and Helm charts.
    • We support multiple environments, separated at the namespace level.
    • Our observability is via New Relic.
    • Our credential stores are wired very well into Helm charts via custom charts and annotations.
    • Our data warehouse is Snowflake, and the data sources are a plethora of DBs and API services.
    • ELT is carried out by Meltano + dbt or Python + Celery.
    • Everything runs on Kubernetes.
    I am a future user of a workflow engine and am asking around for evaluations; I have constructed this so far. I have to confess I spent more time in the #dagster-* channels than in other communities, and it shows in the recommendations chart below in light of our existing infra setup. But I wanted to add more “fairness” to the evaluation ratings, and I’d highly appreciate some constructive feedback from the community on where I can improve this rating.
    👀 1
    11 replies · 3 participants
  • f

    Florian Guily

    07/11/2022, 1:55 PM
    hey, I have a flow that always gets re-registered even when there is no change to the code. Any idea why this is happening?
    4 replies · 2 participants
  • j

    Joshua Greenhalgh

    07/11/2022, 2:06 PM
    @Kevin Kho Thanks for all your help! Good luck with whatever you are doing next!
    :upvote: 1
    1 reply · 2 participants
  • o

    Octopus

    07/11/2022, 2:14 PM
    [v1.2.1] Hi, I would like to run multiple sub-flows from a task (e.g. for each ref I would like to run each action: "read", "write", "delete"). I have the parent flow, which holds the basic data, and the child flow, which executes an action on a ref. With my code I can only trigger 3 of the 9 subflows. I think it's because I use StartFlowRun (I get the same behavior with create_flow_run), because if I call a plain task instead of StartFlowRun, all 9 subflows get executed.
    from prefect import Flow, Parameter, task, unmapped
    from prefect.tasks.prefect import create_flow_run, wait_for_flow_run
    from prefect.executors import LocalDaskExecutor
    
    from prefect.tasks.prefect import StartFlowRun
    
    import time
    from datetime import timedelta
    
    # @task
    # def wait_and_succeed(ref, action_id):
    #     time.sleep(10)
    
    #     print(f"children task success for ref {ref} and action {action_id}")
    
    #     if action_id == "write":
    #         print(f"[SUCCESS] {ref} Second level reached !!!")
    #     if action_id == "delete":
    #         print(f"[SUCCESS] {ref} Third level reached !!!")
    
    @task
    def call_children_flow(ref):
        print(f"{ref} ref")
    
        actions_id = ["read","write","delete"]
    
        for action_id in actions_id:
            start_flow_run = StartFlowRun(flow_name="Generic Children flow")
            print(f"start_flow_run {start_flow_run}")
    
            child_id = start_flow_run.run(parameters={
                    "reference": ref,
                    "action_id": action_id,
            }, project_name="Playground")
    
            wait_for_flow_run.run(child_id)
    
    
    @task
    def run_action(action_id, ref):
        start_flow_run = StartFlowRun(flow_name="Generic Children flow")
    
        print(f"start_flow_run {start_flow_run}")
    
        child_id = start_flow_run.run(parameters={
                "reference": ref,
                "action_id": action_id,
        }, project_name="Playground")
    
        return child_id
    
    
    with Flow("Generic Parent flow") as parent_flow:
        fake_refs = ["ref1", "ref2", "ref3"]
        call_children_flow.map(fake_refs)
    
    if __name__ == "__main__":
        parent_flow.register(
            project_name="Playground"
        )
    
        parent_flow.executor = LocalDaskExecutor(num_workers=20)
    
        parent_flow.run()
    14 replies · 2 participants
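    For comparison, a sketch that maps create_flow_run directly over the full (ref, action) grid instead of looping inside a task, so every child run is a first-class mapped task; the flow and project names match the snippet above, the rest is illustrative:
    from prefect import Flow, unmapped
    from prefect.tasks.prefect import create_flow_run, wait_for_flow_run

    with Flow("Generic Parent flow") as parent_flow:
        # one mapped child per (ref, action) pair: 3 refs x 3 actions = 9 runs
        params = [
            {"reference": ref, "action_id": action}
            for ref in ["ref1", "ref2", "ref3"]
            for action in ["read", "write", "delete"]
        ]
        run_ids = create_flow_run.map(
            flow_name=unmapped("Generic Children flow"),
            project_name=unmapped("Playground"),
            parameters=params,
        )
        wait_for_flow_run.map(run_ids)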
  • i

    Ievgenii Martynenko

    07/11/2022, 2:16 PM
    Has anyone noticed how many DB sessions Prefect 1.0 holds at any point in time? I see it creating a session per request, leading to enormous numbers: 100-200+ active sessions.
    ✅ 1
    4 replies · 3 participants
  • a

    Abin Joseph

    07/11/2022, 2:22 PM
    When will the stable release of Prefect 2 be?
    5 replies · 2 participants
  • m

    Madison Schott

    07/11/2022, 2:40 PM
    Hi all, for some reason my pipeline has been failing for a few days, but I never received an alert about it in our Slack channel. Any ideas why that would be? I want to make sure this doesn't happen going forward.
    7 replies · 2 participants
  • j

    Jehan Abduljabbar

    07/11/2022, 3:37 PM
    What is the docker build command for the Dockerfile at https://github.com/PrefectHQ/prefect/blob/master/Dockerfile?
    4 replies · 2 participants
  • m

    Mars

    07/11/2022, 4:06 PM
    Hi all, is the Prefect Context object intended to be used for all pipeline configuration, including bespoke settings like custom API endpoints, environment names, and such? The examples in the Context concept docs don’t make it clear whether the context object is just for the Prefect framework’s configuration or for my pipeline’s custom configuration too.
    1 reply · 2 participants
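    For what it's worth, user-defined keys can live alongside the framework's own context values in Prefect 1.x; a small sketch of reading and injecting a custom key (the key name and URLs are illustrative):
    import prefect
    from prefect import Flow, task

    @task
    def show_endpoint():
        # falls back to a default when the key is absent from context
        print(prefect.context.get("api_endpoint", "https://prod.example.com"))

    with Flow("ctx-demo") as flow:
        show_endpoint()

    # custom values can also come from the [context] section of config.toml
    with prefect.context(api_endpoint="https://staging.example.com"):
        flow.run()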
  • a

    Amol Shirke

    07/11/2022, 5:38 PM
    Hello, I have Prefect Server behind a load balancer interface. Deployments are created successfully with the CLI, but they don't show up in the UI. Here is how I am running Prefect Orion (Prefect version 2.0b8):
    $ prefect orion start --host 0.0.0.0 --port 8065
    $ prefect config set PREFECT_API_URL=http://0.0.0.0:8065/api
    To access that host, the address looks like http://hostname:some_random_port. The UI loads, but the deployments page is blank, while the CLI shows them. I also tried setting the config to the hostname with the random port, but that didn't work. Any thoughts? Thanks
    8 replies · 2 participants
  • m

    Michał Augoff

    07/11/2022, 6:00 PM
    hi all, a 2.0 question: is it possible to set certain k8s job properties at the agent level (e.g. service account, namespace), as is possible in 1.0, or does everything need to be configured via the Deployment’s KubernetesFlowRunner properties?
    5 replies · 3 participants
  • j

    Joe Goldbeck

    07/11/2022, 7:15 PM
    Hi all: using Prefect 1.x, is it possible to have a timeout/expiration for late flow runs? We run an export job every 15 minutes, so if our agent goes down, late runs can accumulate rather quickly. When the agent comes back up, we do not want to go back and run all the missed scheduled runs; we just want to pick back up from wherever we are.
    17 replies · 3 participants
  • c

    Connor Parish

    07/11/2022, 7:43 PM
    Hi all, I am trying to run a sub-flow on a different Docker image than my main flow and pass data back from the sub-flow to the main flow, using prefect 2.0b7-python3.8. I'm trying to orchestrate the images through DeploymentSpecs with DockerFlowRunners as the flow runners. Currently the sub-flow runs on the same image as the main flow unless it is deployed independently. Would greatly appreciate any ideas or insights into current feasibility!
    ✅ 1
    4 replies · 3 participants
  • v

    Vaikath Job

    07/11/2022, 8:29 PM
    Hi, I'm trying to attach storage to my flows (GitLab with an on-prem host). Prefect Server is hosted on an on-prem K8s cluster. I get an error when trying to use an OAuth token with the Secrets API; the config.toml is located on my local machine and has a [context.secrets] section with my token stored in a variable named GITLAB="<OAuth Token>".
  • f

    flurven

    07/11/2022, 8:34 PM
    Getting an error when importing the library prefect_gcp on Prefect 2.0: NameError: name 'SecretManagerServiceClient' is not defined.
    6 replies · 3 participants
  • v

    Vaikath Job

    07/11/2022, 8:36 PM
    Hi, I'm trying to attach storage to my flows (GitLab with an on-prem host). Prefect Server (1.x) is hosted on an on-prem K8s cluster. I get the following error
    Failed to load and execute flow run: ValueError('Local Secret "<prefect.client.secrets.Secret object at 0x00000221D3B7B100>" was not found.')
    when trying to use an OAuth token with the Secrets API. The config.toml is located on my local machine and has this section:
    [context.secrets]
    GITLAB="<OAuth Token>"
    The code that registers the flow is similar to this:
    secret = Secret("GITLAB")
    flow.storage = GitLab(host="path/to/host", repo="repo/address", path="flow/sample_flow.py", access_token_secret=secret)
    flow.register(project_name="test-project-name")
    I assume this is happening because the config.toml is not on the K8s cluster. If this is the case, is there a way I can attach this storage to the flow without storing OAuth tokens on the cluster itself?
    3 replies · 2 participants
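    Judging from the error text, the storage received the Secret object's repr rather than a secret name: access_token_secret expects the name of the secret, which is then looked up wherever the flow actually runs. A sketch of that variant (host/repo/path copied from the snippet above):
    from prefect.storage import GitLab

    # `flow` as in the registration snippet above
    flow.storage = GitLab(
        host="path/to/host",
        repo="repo/address",
        path="flow/sample_flow.py",
        access_token_secret="GITLAB",  # name only; resolved at runtime
    )
    The runtime environment still needs the secret available, e.g. as a PREFECT__CONTEXT__SECRETS__GITLAB environment variable on the agent/job or as a server-side secret, since the local config.toml never leaves the registering machine.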
  • k

    Kevin Grismore

    07/11/2022, 9:05 PM
    Every time I run a flow, a weird cloudpickle-encoded JSON blob gets uploaded to my GCS flow storage bucket. 🤔 Is that supposed to happen? I thought my storage Block was just for deploying a flow or reading it from an Agent, but I probably misunderstood what else goes in there.
    ✅ 1
    4 replies · 2 participants
  • m

    Mars

    07/11/2022, 9:09 PM
    Hi, is there an easy way to use .env files to load secrets from os.environ after prefect module import time? I want to use a PrefectSecret instead of an EnvVarSecret in my code, and I don’t want to hack the code to switch between Prefect/EnvVar for local dev. Local context secrets should work well for overriding the PrefectSecret values, but it’s not working the way I expect. My debugger is telling me the context and secrets are set once, during import prefect, which means the secrets are fixed before I can load a dotenv file using my library of choice. The following pseudocode doesn’t work:
    import prefect
    import environs  # .env support. .env not loaded yet.
    
    with Flow() as flow:
      PrefectSecret("MY_SECRET")
    
    if __name__ == "__main__":
      # For local testing
      env = environs.Env()
      env.read_env(".env")  # Load my custom PREFECT__CONTEXT__SECRETS into os.environ
      flow.run()  # Ignores new os.environ
    ✅ 1
    1 reply · 2 participants
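    One workaround consistent with that observation is to load the .env file before prefect is ever imported, so the PREFECT__CONTEXT__SECRETS__* variables are already in os.environ when the config snapshot is taken; a sketch (the secret name is illustrative):
    import environs

    env = environs.Env()
    env.read_env(".env")  # must run before the prefect import below

    import prefect  # noqa: E402 -- deliberately imported after the dotenv load
    from prefect import Flow
    from prefect.tasks.secrets import PrefectSecret

    with Flow("secret-demo") as flow:
        PrefectSecret("MY_SECRET")

    if __name__ == "__main__":
        flow.run()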
  • l

    Laxman Singh Tomar

    07/12/2022, 5:01 AM
    Hello everyone. I have multiple microservices/projects (see the attached image) for use cases like Q&A generation, search, data ingestion, etc. If we wanted to give devs the ability to combine these individual components and stitch them together into a service, would Prefect be of help here?
    3 replies · 2 participants
  • a

    Andreas Nigg

    07/12/2022, 8:41 AM
    Hi all, I've encountered a Prefect 2.0 (Cloud) problem: I have a simple flow with a single task, which looks as follows:
    @task(name="get_subscriptions", retries=2, retry_delay_seconds=5)
    def get_subscriptions(paper_code, logger: Logger):
        response = requests.get("my_url")
        return response
    The request itself works fine if I run it manually. However, as soon as I use Prefect 2.0 (with Prefect 2.0 Cloud) to run the flow/task, I run into the following exception. The GET request in the task takes about 1 minute and 10 seconds to return. The exception itself is not coming from the server or my client: I changed my requests.get() call in the task to an http.client request and still get the request exception below, so I have a strong feeling it's somehow related to Prefect. Exception summary: • requests.exceptions.ConnectionError: ('Connection aborted.', timeout('The write operation timed out')) • followed by: 10:36:55.875 | ERROR | Flow run 'chocolate-starling' - Crash detected! Request to https://api-beta.prefect.io/api/accounts/bd169b15-9cf0-41df-9e46-2233ca3fcfba/workspaces/f507fe51-4c9f-400d-8861-ccfaf33b13e4/task_runs/29d89dc3-4d92-4c69-a143-44f164303819/set_state timed out. Exception details: see thread. Is there something wrong with how I use the requests module? Or is there a "hidden" timeout in Prefect when a Prefect-scheduled task runs for more than 1 minute? Edit: I currently run the flow only locally, by running "python name_of_script.py". Edit 2: I'm running the Python env in WSL2. Edit 3: I use GCS storage as my default storage; maybe this causes the problem? Edit 4: I was able to work around the issue by zipping the content of the response before returning it in my flow. So if I change my flow to the following, it works. To me it really looks as if the upload to GCS has a timeout of 1 minute, and the whole flow therefore breaks if the upload takes longer than that minute. I can live with this workaround for the moment; however, I'd be happy to know if my "theory" about GCS being the problem is correct.
    @task(name="get_subscriptions", retries=2, retry_delay_seconds=5)
    def get_subscriptions(paper_code, logger: Logger):
        response = requests.get("my_url")
        return zlib.compress(response.content)
    ✅ 1
    4 replies · 3 participants
  • e

    Emil Østergaard

    07/12/2022, 10:01 AM
    Hello, I'm having a problem with Prefect Cloud 2.0. We use the Kubernetes flow runner and a Dask task runner. On Friday (8/7/2022) I had a flow run that I wanted to abort. I attempted to use the delete functionality in the UI, thinking it would delete all resources related to the flow run, including the Kubernetes job etc. It did not remove the Kubernetes job, so I removed it manually. The issue is concurrency limits: the tasks launched by this flow have a tag with a concurrency limit, and it appears the task data associated with the deleted flow run was not removed from Prefect storage. For instance, if I try:
    prefect concurrency-limit inspect my-tag
    it shows a bunch of active task ids, even though nothing is running in K8s. This causes an unfortunate issue where new flow runs for this flow never start their tasks, because Prefect thinks the concurrency limit is hit due to these zombie tasks. However, I cannot find a way to manually clean up these task ids, which means this flow is dead. Any help is appreciated!
    ✅ 1
    6 replies · 3 participants
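    If the tag can tolerate being reset, one escape hatch is to delete and recreate the limit via the same CLI shown above, which should also drop the recorded active task ids (the limit value here is illustrative):
    prefect concurrency-limit delete my-tag
    prefect concurrency-limit create my-tag 10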
  • s

    Slackbot

    07/12/2022, 10:16 AM
    This message was deleted.