Powered by Linen
best-practices-coordination-plane
  • m

    Michiel Verburg

    05/19/2022, 9:46 AM
    What is the best practice for storing metadata about object locations (e.g. in various datalake zones) when using prefect to move this data around from zone to zone? We feel we lose track of where data lives or which data comes from which source after having initially processed it using a prefect Task if we do not somehow store metadata. However, we are unsure where we should be storing this metadata and if there are some best practices around it.
    ✅ 1
    a
    • 2
    • 4
  • s

    Sander

    05/19/2022, 10:00 AM
    Hi, I'm able to set up a flow and orchestrate it with a deployment spec. Now I'd like to know what the best solution is for running a backfill set of runs. Could you give some pointers?
    ✅ 1
    a
    a
    • 3
    • 11
  • b

    Binoy Shah

    05/19/2022, 1:22 PM
    Can the Prefect Orion server APIs be used to build and orchestrate workflows with only Java developer knowledge? We don't have Python developers in other teams. Can Java developers build their own dockerized jobs and have them scheduled/orchestrated via Prefect Orion?
    a
    • 2
    • 16
  • j

    Jordan Parker

    05/19/2022, 11:08 PM
    Hi All, What a product!
    • Any tips on best practice for designing flows / sub-flows / tasks for batched web requests in 2.0? Should tasks or sub-flows be the preferred function call for a batch?
    • Prefect is consuming a lot of RAM as it fans out to multiple web requests (I presume the output / cache is stored for each request). Is batching and then awaiting batches the best approach to avoid this?
    • Is there a way to view the parameters of a flow / task in the flow / task name in the UI? For example, for a task "collect" that performs a web request that collects id "x", it would be nice to see "collect-{id}-{task-id}" in the UI for the task run.
    Thanks!
    ✅ 1
    👀 1
    a
    • 2
    • 1
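The "batch, then await each batch" idea from the question above can be sketched in plain asyncio. This is only an illustration of the pattern, not Prefect's API: `chunked`, `run_in_batches`, and the stand-in `collect` coroutine are hypothetical names, and the real web request is replaced by a no-op. Only one batch of in-flight coroutines exists at a time, which is what limits memory use compared with fanning out over every id at once.

```python
import asyncio
from typing import Awaitable, Callable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: List[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


async def run_in_batches(
    items: List[T],
    worker: Callable[[T], Awaitable[R]],
    batch_size: int,
) -> List[R]:
    """Run `worker` over `items`, awaiting one batch before starting the next."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: List[R] = []
    for batch in chunked(items, batch_size):
        # Only the coroutines of the current batch are in flight here.
        results.extend(await asyncio.gather(*(worker(item) for item in batch)))
    return results


async def collect(item_id: int) -> str:
    """Hypothetical stand-in for a web request that collects one id."""
    await asyncio.sleep(0)
    return f"collect-{item_id}"
```

For example, `asyncio.run(run_in_batches(list(range(10)), collect, batch_size=3))` processes the ids in four batches and returns the results in input order, because `asyncio.gather` preserves the order of its arguments.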
  • j

    Jan Domanski

    05/22/2022, 5:39 PM
    Hi everyone, What’s the recommended approach in beta to splitting up alpha/beta/gamma environments? We’re in beta, so there is a single workspace per email address that’s shared (perfect@someco.com), so we can’t do different envs per workspace. One thought was a single workspace, with a single S3 storage (the storage CLI is so weird BTW, no ability to remove a storage) and multiple work queues. Then there would be one agent per env, each consuming a different work queue for some isolation? Then use tags to separate which flow runs go to which work queue? What are the recommendations in terms of architecture here?
    k
    a
    • 3
    • 5
  • j

    Jacob Blanco

    05/23/2022, 7:53 AM
    Any tips on handling old images left over from a Local agent? Is there any way for the agent to clear those out after the flow runs? I’d rather not have to create a cron job to delete old images from the EC2 instance we are running on.
    a
    k
    • 3
    • 2
  • j

    jedi

    05/23/2022, 8:24 PM
    Is it possible to deploy or register a workflow from, say, agent 1 to run on agent 2? Or do I need to log into the host running agent 2 and register/deploy the flow from there?
    k
    a
    • 3
    • 2
  • j

    Jeffrey Lam

    05/24/2022, 4:34 PM
    Hi, if a task fails due to bad inputs from an upstream task, is there a way to modify the outputs of the upstream task before restarting the failed task? I'm using Prefect 1.2.1
    k
    • 2
    • 3
  • j

    Jacob Blanco

    05/25/2022, 2:29 AM
    We are currently using GitHub releases (tags, more specifically) and CircleCI to publish flows into AWS ECR. We are doing this from a monolithic repository so that our data scientists can publish their flows in a standardized way without needing to worry about infrastructure, etc. We use a hash of the flow file to determine whether the file has changed since the last release, and we deploy all the flows that have changed. We are running into an issue in Staging whereby a release from one person clobbers the release made by another person (since they are working in different branches). Does anyone have a similar setup? How have you addressed the issue above? I suppose some kind of version ordering would address this issue. As in, if the registered version in Cloud is greater than the version to be deployed, then don’t deploy.
    ✅ 1
    s
    a
    • 3
    • 10
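The hash check and the proposed version-ordering rule from the message above can be sketched as follows. This is a minimal illustration under stated assumptions, not the poster's actual pipeline: `file_digest` and `should_deploy` are hypothetical names, versions are assumed to be comparable tuples such as `(1, 4, 0)`, and looking up the digest and version currently registered in Cloud is left out entirely.

```python
import hashlib
from pathlib import Path
from typing import Optional, Tuple

Version = Tuple[int, ...]


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a flow file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def should_deploy(
    local_digest: str,
    local_version: Version,
    registered_digest: Optional[str],
    registered_version: Optional[Version],
) -> bool:
    """Decide whether a flow should be deployed.

    Deploy when the flow has never been registered, or when its contents
    changed and the registered version is not greater than the local one.
    """
    if registered_digest is None or registered_version is None:
        return True  # nothing registered yet
    if local_digest == registered_digest:
        return False  # file unchanged since the last release
    # Version ordering: never overwrite a newer registered release.
    return local_version >= registered_version
```

With this rule, a release built from an older branch (say `(1, 2, 0)`) is skipped when `(1, 3, 0)` is already registered, even though its file hash differs.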
  • p

    Parwez Noori

    05/26/2022, 3:39 PM
    We were considering the following architecture in Azure: 1. Prefect Cloud 2. Azure Container Instances (as the execution layer). However, we have some issues when running the containers on Azure. Does anybody know any good tutorials, in the form of articles or videos, that use the Docker Agent with Docker? Furthermore, we are also considering the scalability of this setup in the long run. I assume that most people deploy their own Prefect server to Kubernetes together with many pods as the execution layer. However, if we wanted to start smaller because of the cost, what could be a viable solution on Azure?
    k
    • 2
    • 2
  • a

    Aaron Goebel

    05/26/2022, 11:11 PM
    I'm triggering a flow from the cloud UI to run on kubernetes. Kubernetes agent is authenticated. The kubernetes agent is spawning jobs, but they fail with this error:
    k
    • 2
    • 13
  • d

    Domenico Di Gangi

    05/29/2022, 10:48 AM
    Hi all! I am wondering what is the best practice, in prefect 2.0 orion, to trigger a flow run if a run of another flow fails. My concrete use case is the following: I would like to send a slack message as an alert in case a periodic flow (let's call it _flow_A_) fails. I have already set up _flow_B_ that sends a slack message using prefect-slack collection. I just need to trigger it with some input if a run of _flow_A_ fails
    a
    • 2
    • 6
  • m

    Marcin Grzybowski

    05/30/2022, 10:02 AM
    Hi, what is the best way to run "subtasks"? I know I can just call this inside taskA:
    taskB.run()
    but then it's treated as a normal Python function (so no logging, and none of Prefect's other magic is available for taskB). I managed to work around this by creating a Flow inside a Task, something like this:
    from prefect import Flow, task

    @task
    def taskA(config: dict):
        # ...
        # Build a nested flow so that taskB runs as a real Prefect task
        with Flow("flowB") as flowB:
            taskB()
        # ...
        a = flowB.run()  # run the nested flow and keep its final state
    It seems to work: I get logs for flowB and taskB, etc. I can visualize both flows using flow.visualize(), on two separate diagrams, but that's all right for me. But is this a correct approach, or should I do this some other way?
    a
    • 2
    • 13
  • s

    Shriram Holla

    05/30/2022, 2:11 PM
    Hi, We are exploring alternative options to Airflow and were wondering if Prefect is compatible with build systems. We use bazel and some of our scripts are only visible when run through bazel. Can we configure flows to run through a build system?
    a
    • 2
    • 9
  • m

    Marcin Grzybowski

    05/31/2022, 6:12 AM
    Hi, can I generate a flow diagram in Prefect 2.0 from code? In 1.0 I could use flow.visualize() to get a PDF with the diagram.
    a
    • 2
    • 6
  • a

    Andreas Nord

    05/31/2022, 1:33 PM
    Hi! I'm running a Prefect agent on an Azure VM. Any suggestions for saving the cost of the idle VM (I only need about 20 minutes nightly to run the flows)? Is there some built-in Prefect functionality that can help boot/shut down the VM to match the jobs?
    k
    • 2
    • 7
  • m

    Marcin Grzybowski

    05/31/2022, 1:41 PM
    Hi! So I started playing with Prefect 2.0 and here is my question: tasks and flows are visible on the radar, but "normal" Python functions are not. If I want to achieve "granularity" on the radar, do I have to use subflows, since I cannot run a task from a task?
    k
    • 2
    • 3
  • m

    Marcin Grzybowski

    05/31/2022, 3:56 PM
    Hello again. How can I run the same flow in parallel? I used the code from https://discourse.prefect.io/t/how-can-i-run-multiple-subflows-or-child-flows-in-parallel/96 and it works. But when I simplify it and modify it to run the same flow (rather than 4 different ones), I get
    RuntimeError("The task runner is already started!")
    Modified code below:
    import asyncio
    from prefect import flow
    
    @flow
    async def subflow_1():
        print("Subflow 1 started!")
        await asyncio.sleep(1)
    
    
    @flow
    async def main_flow():
        parallel_subflows = [subflow_1(), subflow_1()]
        await asyncio.gather(*parallel_subflows)
    
    if __name__ == "__main__":
        main_flow_state = asyncio.run(main_flow())
    a
    m
    • 3
    • 12
  • j

    Jacob Bedard

    06/01/2022, 2:21 AM
    Hi folks, I've used Google Kubernetes engine a bit in a previous role, but we were running airflow on a VM and just sending jobs off to k8s, so it was pretty different than the parallel compute options we've got with Prefect. I switched to Prefect about a year ago (what a nice change), and I was most recently running a pretty simple setup with just a beefed up EC2 since my light pipelines were nowhere near hitting schedule conflicts. Now I'm looking to move my new company's pipelines over to Prefect and I need parallelization. So, looks like I need to go with ECS or EKS (we're on AWS). I haven't had to set up either of these services before. I guess my main question is: where does the agent actually sit? I get the config files etc., but this'll be my first time not running the agent on a VM, and the VM route is so simple because I just go into the machine and start the agent from the terminal. Where does the agent get initialized in the case of ECS or EKS?
    k
    • 2
    • 4
  • y

    yair friedman

    06/02/2022, 6:15 AM
    Hi, how can I get a unique ID for every task run that executes in parallel? If I have task A and I run it with: runs = A.map(listOfInputs, unmapped(input2)), then when I print this in A:
    task_id = prefect.context.get("task_id", "")
    I get the same task_id for every run of task A.
    a
    • 2
    • 4
  • l

    Linh Nguyen

    06/02/2022, 9:00 AM
    Hi there, has anyone set up flows to run in different environments, e.g. cloud vs. local? We have ECS agents running flows with the labels prod and uat, and we would like these flows to use a different run config when running locally for development. I hope my question is clear. Many thanks
    a
    • 2
    • 2
  • m

    Marco Pérez

    06/02/2022, 12:23 PM
    Hi team - have a question, we’re going to start implementing some scheduling infrastructure and I’m planning to implement Prefect, my first thought is to implement prefect 2.0 but the warning regarding “being in beta” and “don’t use for prod” flows concerns me. At the same time, I wonder if it may not be the best use of my time to learn Prefect 1.0 if 2.0 will be GA in the near future. Thoughts? Ideas? Food recommendations?
    a
    • 2
    • 3
  • s

    Siôn

    06/07/2022, 12:03 PM
    Hi Prefect Community! Does anyone have any idea how to set the Python Version for the Kubernetes Agent? Currently getting this error:
    python: (flow built with '3.10.4', currently running with '3.7.13')")
    But there doesn't seem to be any documentation on how to set this Python version in the agent deployment CLI.
    a
    • 2
    • 2
  • m

    Max Jackson

    06/07/2022, 8:51 PM
    hello all! what’s the best way to pass a Prefect keyword to a new Task class? I want to do something like
    @task(key=value)
    within a structure like
    class MyNewTask(Task):
        def run(self):
            pass
    where would I pass those key/value pairs when initializing the new capital-T Task?
    k
    • 2
    • 2
  • a

    Anat Tal Gagnon

    06/08/2022, 12:59 PM
    Hi guys, we have a general question. We are considering Prefect 2.0 (not Cloud) and have a question about the queue: what's the queue engine in the backend? More specifically, if we need to transfer 1000 files (or 10000), do we need to add an external queue like RabbitMQ or Kafka, or can Prefect handle it well? Would love your 2 cents, folks! Thanks
    ✅ 1
    a
    • 2
    • 7
  • d

    David Arias

    06/09/2022, 1:05 AM
    Hi people, I'm new to Prefect.
    • I have an EC2 instance (AWS) running a flow that is registered in Prefect Cloud.
    • Basically, it reads some data from RDS (AWS) and stores it in S3 (AWS).
    • It is scheduled to run this process every day.
    I am running a local agent on the same EC2 instance through screen (in the background), but I don't think this is a good solution. Does anybody know which solution would be suitable for my case? Thanks
    ✅ 1
    a
    • 2
    • 3
  • a

    Aaron Goebel

    06/10/2022, 8:13 PM
    Some of my prefect tasks need to use a database connection. Connections are not serializable, so tasks would have to construct one on run, but this is pricy. If you have a bunch of tasks that require the connection is there a way to provide them with something like a connection pool?
    k
    • 2
    • 2
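One common way to avoid both problems raised above (unserializable connections and the cost of reconnecting in every task) is to pass only a serializable connection string between tasks and cache the connection object per process. The sketch below uses only the standard library and is not a Prefect feature: `get_connection`, `write_event`, and `count_events` are hypothetical names, and `sqlite3` stands in for whatever database driver is actually used. It assumes the tasks run in the same process and thread; with multiple threads or worker processes, each would need its own connection or a real pool from the driver.

```python
import sqlite3
from functools import lru_cache


@lru_cache(maxsize=None)
def get_connection(dsn: str) -> sqlite3.Connection:
    """Open the connection on first use, then reuse it within this process."""
    return sqlite3.connect(dsn)


def write_event(dsn: str, event_id: int) -> None:
    """Hypothetical task body: takes the DSN string, not a connection object."""
    conn = get_connection(dsn)
    conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER)")
    conn.execute("INSERT INTO events VALUES (?)", (event_id,))
    conn.commit()


def count_events(dsn: str) -> int:
    """Hypothetical task body that reuses the cached connection."""
    row = get_connection(dsn).execute("SELECT COUNT(*) FROM events").fetchone()
    return row[0]
```

Because both functions resolve the same DSN to the same cached object, the second call pays no connection cost, and nothing unserializable ever crosses a task boundary.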
  • j

    Jonny

    06/12/2022, 12:39 PM
    Hello, I'm trying out Prefect and wondering if I can use an agent to invoke a .NET process which already exists?
    k
    • 2
    • 1
  • a

    Alan Snyder

    06/13/2022, 1:30 PM
    gm everyone! qq w.r.t. prefect vs temporal.io… i’m reading that temporal uses a ‘fault-oblivious’ model for running workflows - which I suppose means that it doesn’t really care how long tasks take and how many retries it takes to complete. I haven’t dug into temporal enough to know what this looks like in a real situation, but it sounds useful. Does prefect have anything like this? Are there any other guides comparing temporal and prefect? Thanks!
    ✅ 1
    a
    • 2
    • 5
a

Anna Geller

06/13/2022, 1:33 PM
I can't say anything about temporal since it's an entirely different product, but could you say more about the problem you are trying to solve and your use case?
a

Alan Snyder

06/13/2022, 1:35 PM
We’re looking to trigger workflows from one of 3 inputs… queue, api, and timer. the workflow would be semi-long running, maybe a few minutes, and makes a bunch of api and http calls, most would be async, and act on the responses. if one of those fails, we’d like to retry. Do we need to build in that retry logic?
Also - is there GoLang support for tasks?
a

Anna Geller

06/13/2022, 1:44 PM
It sounds like a great use case for Prefect if you'd like to write your logic in Python. We don't have support for Golang and I wouldn't expect that anytime soon.
• We do support retries for tasks.
• In the future, we may add retry/restart functionality for flows or subflows. It's not in the current scope, but it's something we may consider after the General Availability of Prefect 2.0.
• You can call Prefect flows from all the sources you listed, since we support both scheduled runs and runs triggered via an API call (fully dynamic).
a

Alan Snyder

06/13/2022, 1:49 PM
Ok cool - thanks for the info!
🙌 1
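To make concrete what task-level retries save you from writing by hand, here is a rough plain-asyncio sketch of the retry behaviour discussed in this thread. It is not Prefect's implementation or API: `with_retries` and its parameters are hypothetical names chosen for illustration. In Prefect 2.0 the equivalent is configured on the task decorator (`retries`, `retry_delay_seconds`) rather than written out like this.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

R = TypeVar("R")


async def with_retries(
    make_call: Callable[[], Awaitable[R]],
    retries: int = 3,
    delay_seconds: float = 0.0,
) -> R:
    """Await make_call(); on failure, retry up to `retries` more times."""
    attempt = 0
    while True:
        try:
            return await make_call()
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            await asyncio.sleep(delay_seconds)
```

A call such as `await with_retries(lambda: fetch(url), retries=2, delay_seconds=1.0)` (with `fetch` being your own coroutine) would make at most three attempts before re-raising the last exception.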