best-practices-coordination-plane
  • j

    Josh Cowles

    07/21/2022, 8:17 PM
    Hi there everyone! I'm wondering what people have been using to pass data between Prefect 2 tasks when the data is a tf.data.Dataset or one of its subtypes. Does anyone have experience passing those through tasks? Or would I be better off simply using the TensorFlow-provided methods for saving and loading to a file, and just using Prefect results for the file paths, etc.?
    k
    • 2
    • 3
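    The save-to-disk-and-pass-paths option mentioned above can be sketched as follows. This is a minimal sketch of the pattern only: `pickle` stands in for TensorFlow's own dataset save/load methods so the example stays dependency-free, and the function names are hypothetical (in a real flow each would carry a `@task` decorator and the path string would be the task's result).

    ```python
    import pickle
    import tempfile
    from pathlib import Path

    def produce_dataset(workdir: str) -> str:
        """Materialize a dataset to disk and return only its path."""
        data = [{"x": i, "y": i * 2} for i in range(5)]  # toy records
        path = Path(workdir) / "dataset.pkl"
        with open(path, "wb") as f:
            pickle.dump(data, f)
        return str(path)

    def consume_dataset(path: str) -> int:
        """Downstream task: reload from the path rather than receiving
        the in-memory object from the upstream task."""
        with open(path, "rb") as f:
            data = pickle.load(f)
        return len(data)

    with tempfile.TemporaryDirectory() as d:
        n = consume_dataset(produce_dataset(d))
    ```

    Passing only the small path string between tasks keeps large objects out of Prefect's result storage, at the cost of the tasks needing shared access to the filesystem.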
  • h

    Ha Pham

    07/22/2022, 3:08 AM
    Hi, I'm looking into Prefect 2.0 and have questions about managing secrets / credentials for external services. In 1.0 local development, I can set them in the
    config.toml
    file, or put them in an .env file. In 2.0 it looks like this is replaced by the
    Block
    concept. So:
    • Looking at the Block docs, I think currently the only way for me to register a certain secret is to write it as a Block in a script, then run it. Is this correct?
    • It is said that the Block's content will be saved in the DB (defaulting to SQLite). Does this mean that if I want to see the content outside of Prefect's development workflow, I have to query the DB directly?
    • How do I modify & delete saved secrets?
    • What's the best way to manage environment variables?
    ✅ 1
    a
    • 2
    • 1
  • r

    Rohit

    07/23/2022, 1:42 PM
    Hello! Let's say I have a Python script deploy.py that takes the best model from the MLflow model registry and deploys it as a Flask application. Is there a simple way I can add this script as a task in a Prefect flow? Something like this:
    import subprocess

    @task
    def deploy_best_model():
        subprocess.run(["python", "deploy.py"], check=True)
    ✅ 1
    a
    • 2
    • 5
  • s

    Surawut Jirasaktavee

    07/23/2022, 1:46 PM
    Hi, I need help. I get the error below when I run
    prefect storage ls
    or
    prefect storage create
    . How can I solve this?
    ✅ 1
    a
    • 2
    • 16
  • o

    Owais Farooqui

    07/24/2022, 3:43 PM
    I am getting
    sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) table flow already exists
    when I try to run
    prefect storage ls
    or
    prefect storage create
    . I am using prefect==2.0b6.
    a
    • 2
    • 1
  • k

    Kun Yin

    07/25/2022, 6:24 AM
    Hi! When I run the example code after successfully logging in to my Prefect Cloud account (prefect version == 2.0b9), I get the error: prefect.exceptions.PrefectHTTPStatusError: Client error '422 Unprocessable Entity' for url 'https://api-beta.prefect.io/api/accounts/f68f7059-5eec-4e71-acc8-d7f351373e43/workspaces/ada6556d-d542-4378-9b90-b5647a4dd404/flow_runs/'. How can I solve this?
    🙌 1
    ✅ 1
    a
    j
    • 3
    • 3
  • k

    Kun Yin

    07/26/2022, 9:13 AM
    Version:
    prefect==2.0b12
    Problem: when I run
    a, b = task.submit(c, d)
    I get the error `cannot unpack non-iterable PrefectFuture object`. In Prefect 1.0+ I could use
    task(nout=2)
    when my task had multiple return values. What should I do in the same situation in Prefect 2.0+? Maybe
    a, b = task.submit(c, d).result()
    would work, but I don't think that's a good way.
    ✅ 1
    a
    • 2
    • 1
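    The failure mode above is generic to futures: the future object itself is not iterable, while its resolved result (the tuple the task returned) is. A minimal stdlib sketch, using `concurrent.futures` as a stand-in for a PrefectFuture since both raise this kind of error when unpacked directly:

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def two_values(c, d):
        # A function returning "multiple values" actually returns one tuple.
        return c + d, c * d

    with ThreadPoolExecutor() as pool:
        future = pool.submit(two_values, 2, 3)
        # a, b = future          # TypeError: cannot unpack non-iterable Future
        a, b = future.result()   # resolve first, then unpack the tuple
    ```

    Resolving first does block on the result, which is the trade-off the question alludes to.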
  • s

    Sana Shaikh

    07/26/2022, 8:53 PM
    Hello everyone. It's been an amazing experience with Prefect. Right now I am trying to orchestrate dbt with Prefect. @Anna Geller You have done awesome work 🙌. I am following your Medium article and hitting a blocker while registering the flow on Prefect Cloud. Can you please help me? I get stuck on the screens below: https://medium.com/the-prefect-blog/orchestrating-elt-with-prefect-and-dbt-a-flow-of-flows-part-1-aac77126473#26cb
    ✅ 1
    🙌 1
    o
    a
    • 3
    • 5
  • s

    Sana Shaikh

    07/26/2022, 8:58 PM
    Anyone else can help
    ✅ 1
  • s

    Sana Shaikh

    07/26/2022, 8:58 PM
    Highly appreciated
    ✅ 1
    o
    • 2
    • 2
  • j

    John Kang

    07/26/2022, 11:32 PM
    I'm running into an issue where, when a flow encounters an error, it looks like the asyncio library throws an error. Any idea why?
    ✅ 1
    a
    • 2
    • 10
  • d

    Darren

    07/27/2022, 6:08 PM
    I have been skimming through the documentation and haven't been able to find the answer to a question I have regarding a local Prefect agent. With the agent on a local network behind a firewall, can my flows access local resources like a MySQL database?
    ✅ 1
    a
    j
    • 3
    • 3
  • a

    Anna Geller

    07/28/2022, 4:05 AM
    has renamed the channel from "best-practices-orchestration" to "best-practices-coordination-plane"
    :cool-llama: 5
    :prefect: 6
  • m

    Matt from DataHouse

    07/29/2022, 2:36 PM
    Hey everyone, I'm getting an error on the Prefect Orion server on Windows:
    File "c:\ds\tools\python3.8\latest\lib\site-packages\httpx\_transports\default.py", line 77, in map_httpcore_exceptions
        raise mapped_exc(message) from exc
    httpx.ConnectTimeout
    Here's a link to a GitHub issue with the full stack trace. I tried various Prefect versions, most recently 2.0.0, but I get the same error. Is it an issue with asyncio? I think John Kang's message from yesterday suggests a similar problem. In this case, though, I can't even start the server.
    a
    • 2
    • 3
  • v

    Viet Nguyen

    08/01/2022, 2:57 AM
    Hi all, I'm trying to implement the first concept in the diagram but got this error, even though the Lambda function was assigned an IAM role with near-admin permissions, and I'm not sure how to deal with it (e.g.
    Failed to create the Prefect home directory at /home/sbx_user1051/.prefect
    ). The full error is below. I didn't execute any Prefect flow, though; I just tried to import the prefect package with a dummy Lambda handler function. We aim to build on serverless infrastructure using AWS services. Our actual pipeline may involve processing a very large number of NetCDF files, but on many occasions it will be just a few newly uploaded files. So my questions: how do I overcome the above error? And is the second option doable? We would use a Fargate cluster for the Dask client, but the environment where the Dask client is created needs sufficient memory (~10 GB etc.). Many thanks.
    a
    c
    c
    • 4
    • 15
  • e

    eddy davies

    08/01/2022, 2:24 PM
    Hey, I have just installed Prefect using pip and pyenv, but the Server Type is <Client Error>. Can anyone help?
    ✅ 1
    m
    • 2
    • 10
  • j

    John Kang

    08/01/2022, 7:14 PM
    One note for those running remote deployments on Windows machines: it seems that whenever a flow is triggered on an agent, the entire contents of the remote deployment are downloaded to the machine in userid/AppData/Local/Temp/somegibberishfoldernameprefect. FYI, if you have large flows this could add up, and you may want to trigger some sort of operation to clear out older Prefect flow directories. Does anyone have best practices for scheduling cleanup of folder directories on machines running Prefect flows?
    r
    • 2
    • 2
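    One way to script the cleanup described above: a sketch that deletes download directories older than a cutoff. The `"prefect"` name fragment and the age threshold are assumptions; adjust them to match the folder names actually seen under AppData/Local/Temp. The demo below runs against a throwaway directory, not the real Temp folder.

    ```python
    import os
    import shutil
    import tempfile
    import time
    from pathlib import Path

    def clean_old_dirs(root: str, name_fragment: str, max_age_days: float) -> int:
        """Remove subdirectories of `root` whose name contains `name_fragment`
        and whose modification time is older than `max_age_days`.
        Returns the number of directories removed."""
        cutoff = time.time() - max_age_days * 86400
        removed = 0
        for entry in Path(root).iterdir():
            if (entry.is_dir()
                    and name_fragment in entry.name
                    and entry.stat().st_mtime < cutoff):
                shutil.rmtree(entry, ignore_errors=True)
                removed += 1
        return removed

    # Demo: one stale "prefect" directory (backdated a week) and one
    # unrelated fresh directory that must survive the sweep.
    demo_root = tempfile.mkdtemp()
    stale = Path(demo_root) / "abc123prefect"
    stale.mkdir()
    old = time.time() - 7 * 86400
    os.utime(stale, (old, old))
    fresh = Path(demo_root) / "unrelated"
    fresh.mkdir()
    removed = clean_old_dirs(demo_root, "prefect", max_age_days=1)
    ```

    A function like this could itself be scheduled as a small Prefect flow, or run from the OS task scheduler on each agent machine.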
  • j

    John Kang

    08/01/2022, 9:40 PM
    Question, I am running a flow and have it saving a file to a relative path. When I check the relative path on my local filesystem the file is not updated when the flow executes from a Prefect agent. I tried it again with a remote filesystem and it does not update the files on the remote filesystem. I checked on the server and the blocks holding the credentials for both local and remote both state that the block has read and write privileges. Any idea on what I can do to actually have the file update the local (when the deployment is run using the local file system) and remote (when the deployment is run using the remote file system)?
    a
    k
    • 3
    • 10
  • m

    Maximilian Schnieder

    08/02/2022, 5:55 AM
    Hi, I was wondering if I could use prefect for object methods. My use case would be that instead of creating a thread that runs inside the object every 60 seconds, I could create a scheduled flow for the object. But is this even possible? And how would the deployment file look?
    👀 1
    ✅ 1
    b
    • 2
    • 5
  • j

    John Kang

    08/03/2022, 8:38 PM
    Encountered an issue with the Agent in a Windows terminal. Just an FYI (on 2.0.2).
    k
    j
    • 3
    • 5
  • p

    Pipat (Benz) Methavanitpong

    08/04/2022, 3:58 AM
    How do you all supply parameters for a flow across different environments, e.g. local, staging, and prod? With CLI-based deployment, I have to modify deployment files locally or modify them in the web UI after applying. I think it's weird to drop the
    Deployment
    class. It could provide the deployment configuration above the
    DO NOT EDIT
    line in a generated deployment file. https://orion-docs.prefect.io/concepts/deployments/
    ➕ 4
    a
    m
    +2
    • 5
    • 17
  • j

    Jonas Dahlbæk

    08/04/2022, 10:59 AM
    Question: we are looking for a pipeline orchestrator and I have been testing Prefect. I had a very good experience following the deployment docs, but one thing has me a bit confused: I was testing the flow versioning feature by sending a TERM signal to the agent while it was running a flow, and it seems this leaves the flow in a Running state indefinitely (or Pending, depending on when I interrupt the agent). I am following https://docs.prefect.io/concepts/work-queues/ for the agent setup, and what I see seems similar to the issue described here: https://github.com/PrefectHQ/prefect/issues/2834, so I went looking for Docker or Kubernetes agents. However, I'm not finding anything for Prefect v2. Am I doing something horribly wrong? What is the best practice for handling hanging flows due to issues on the agent side?
    👀 1
    ✅ 1
    b
    • 2
    • 6
  • d

    Darren

    08/04/2022, 3:45 PM
    I'm still new to tools like Prefect. I am trying to automate our onboarding process, which pulls employee information (API/JSON) from one source and creates accounts in three other applications via API/JSON. My thought process is to create a flow to pull the data, check whether the accounts exist, and create them if they don't. My question: would it be better to pass the data between tasks, or to place the data in a store such as a file (CSV maybe?) or a database?
    ✅ 1
    j
    • 2
    • 7
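    The check-then-create flow described above can be sketched with data passed directly between tasks, which is usually fine at this scale. Everything here is a hypothetical stand-in for the real APIs: plain functions instead of `@task`-decorated ones, hard-coded records instead of API pulls, and a set instead of a lookup call.

    ```python
    def pull_employees():
        # Stand-in for the API/JSON pull from the source system.
        return [{"id": 1, "email": "a@example.com"},
                {"id": 2, "email": "b@example.com"}]

    def account_exists(existing, employee):
        # Stand-in for a per-application existence check.
        return employee["email"] in existing

    def create_missing_accounts(employees, existing):
        created = []
        for emp in employees:
            if not account_exists(existing, emp):
                created.append(emp["email"])  # stand-in for a create call
        return created

    existing_accounts = {"a@example.com"}
    created = create_missing_accounts(pull_employees(), existing_accounts)
    ```

    Writing the pulled data to a file or database in between mainly buys auditability and restartability; for small onboarding batches, passing it in memory keeps the flow simpler.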
  • k

    Karl Bühler

    08/06/2022, 10:30 PM
    What is best practice for API calls (requests sessions)? My issue is: I would like to use the same session object for my scheduled flows. I tried just caching the session object, but that doesn't lead to the desired results. Any best practices for handling sessions?
    a
    • 2
    • 1
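    One common pattern for the session question above: construct the session lazily, once per process, and have every task fetch it through a cached accessor. A hedged stdlib sketch; a plain dict stands in for `requests.Session()` so the example has no third-party dependency.

    ```python
    from functools import lru_cache

    @lru_cache(maxsize=1)
    def get_session():
        # In real code: s = requests.Session(); s.headers.update(...); return s
        # The dict below is just a dependency-free stand-in for the sketch.
        return {"headers": {"User-Agent": "my-flow/1.0"}}

    # Every caller in the same process receives the identical object.
    s1 = get_session()
    s2 = get_session()
    ```

    Note the caveat: this only helps within a single process. Separately scheduled flow runs typically execute in separate processes, each of which builds its own session, which may be why caching alone did not produce the desired result.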
  • а

    Андрей Насонов

    08/08/2022, 12:46 PM
    Hi! We're moving from Prefect 1.0 to Prefect 2.0. We would like to use our GitLab repository as storage for flows, as we used to with GitlabStorage in Prefect 1.0. I'm trying to use RemoteFileSystem:
    from prefect.filesystems import RemoteFileSystem
    gitlab_block = RemoteFileSystem(
        basepath='git://path/to/repo',
        settings={
            'key': "GITLAB_USER",
            'secret': "GITLAB_TOKEN"})
    gitlab_block.save('flows_repo')
    This leaves me with
    prefect.exceptions.PrefectHTTPStatusError: Client error '422 Unprocessable Entity' for url 'http://ephemeral-orion/api/block_documents/'
    I might be misusing RemoteFileSystem something fierce; could you please guide me in the right direction? Big thanks!
    a
    a
    • 3
    • 3
  • c

    Chris L.

    08/08/2022, 3:07 PM
    Hello Prefecters. A question about Prefect 2.0 deployment. I have one
    scheduler
    flow that takes a
    subflow_key
    . This
    subflow_key
    is passed into a curried function that dynamically creates a new Prefect flow. There are about 30 (and growing) different subflows that can be created dynamically. My problem: I would like a single-flow, many-deployments setup (each deployment associated with a different type of subflow). Constraints: because subflows are generated dynamically by the curried function, I can't separate them into parent flows and run
    prefect deployment build
    for each flow. What I've tried: 30 different deployment yaml files for 1
    scheduler
    flow with 30 different combinations of
    subflow_keys
    and schedules. My question: is there a DRYer way to achieve the same setup, given each deployment file is identical except for two lines (parameters and schedule)? What is Prefect engineering's current take on this single-flow, many-deployments paradigm? Will this be achievable via a single Prefect CLI command in the future (maybe with arrays of parameter/schedule flags passed into
    prefect deployment build
    )?
    j
    m
    • 3
    • 4
  • а

    Андрей Насонов

    08/08/2022, 3:57 PM
    Hi! As I understand it, KubernetesJob exists on the level of infrastructure and allows for flow execution as a separate job. Is there a similar task runner for this (separate task - separate job), or should I stick to using subflows?
    j
    • 2
    • 1
  • n

    Neil Natarajan

    08/08/2022, 5:26 PM
    Hi! Does Prefect support running flows from within Prefect tasks? What is the best syntax for running a flow of flows with the free offering?
    a
    • 2
    • 4
  • z

    Zheyuan

    08/09/2022, 1:50 PM
    Hi guys! I have two questions. Is there a full example of using Dask's worker resources with Prefect flows? If such examples were provided, I could write up some best practices. (Except for AWS etc., because I don't know how to use it.) Another question: for Dask clusters, can I use Prefect to implement dynamic task submission while also ensuring task dependencies? There also seems to be unpredictable task loss when I try to use multiple flows with the same Dask client; this may be my own problem. By the way, the Prefect 2 docs are maybe not as clear as Prefect 1's, and there are so many concepts in Prefect 2 that I couldn't quickly figure out what I needed to understand. Hope to get help from everyone.
    r
    • 2
    • 2
  • z

    Zheyuan

    08/11/2022, 5:33 AM
    In the resource annotation, if my machine has 4 GPUs, how do I know which GPU is currently being used by a worker when I use a temporary cluster? Are temporary workers automatically assigned GPUs? https://prefecthq.github.io/prefect-dask/
    a
    • 2
    • 6
z

Zheyuan

08/11/2022, 5:33 AM
This is important for running external programs. We usually need to provide a GPU id to specify which GPU a program uses, so we need to figure out how the temporary cluster assigns GPUs to tasks. Do we need to tell the task which GPU id to use?
a

Anna Geller

08/11/2022, 10:14 AM
Interesting. I wonder whether Dask is the right solution to your problem. You might be better off either spinning up separate agents on VMs with GPUs, or, even easier, using a Kubernetes agent with labels indicating which GPU to use. Both seem to be less of a black box in terms of allocating your work to the right compute.
If you're running on AWS, I can recommend Karpenter, which makes it easy to provision the right GPUs for your runs.
z

Zheyuan

08/11/2022, 11:37 AM
@Anna Geller Thanks Anna. Your suggestion has completely discouraged me from using temporary clusters. Using an existing cluster seems to be the only achievable solution.😭
a

Anna Geller

08/11/2022, 11:46 AM
Maybe not the only option, but definitely the easiest. Karpenter can help you spin up GPU compute on Kubernetes for workflows that need it, simply by assigning the right labels on your Prefect KubernetesJob block.
And it can spin the compute down when it's no longer needed, too.