prefect-community
  • f

    FuETL

    04/27/2022, 8:14 PM
    Hey guys, I have a task that can take up to 20 minutes to run (it also has a retry decorator), and I started getting
    Marked Failed by a Zombie Killer process
    How can I increase this tolerance, and why is this happening? My task just takes a while to run, but that doesn't mean it has failed.
    k
    • 2
    • 1
  • d

    David Haynes

    04/27/2022, 8:20 PM
    Hey folks. If I register a flow, I get an identifier back. How do I use that identifier to get the flow state information back from the service? I am probably just looking in the wrong place, but I can't find it in the docs.
    k
    a
    • 3
    • 14
  • m

    Mars

    04/27/2022, 8:47 PM
    Hi all, what would you consider best practice for reporting business-case-specific results from a Prefect pipeline, such as a data cleansing task or data quality checkpoint task? Would you report the results directly to the prefect logs, or is it better to keep the prefect logging to task summaries and output detailed reports as a text or html file?
    k
    • 2
    • 6
  • m

    Matt Alhonte

    04/28/2022, 1:34 AM
    What does it mean when a Task has the
    Running
    state but also has an
    End Time
    and the
    Duration
    isn't ticking up? Example:
    k
    • 2
    • 6
  • j

    Jonathan Mathews

    04/28/2022, 8:39 AM
    Hi, to use Gitlab storage, do I have to provide my personal access token, or can I use a repository specific deploy token? As far as I can see, a personal access token gives read/write access to all of my repositories. Perhaps a project-level access token would work, but I see the docs refer to a personal access token: https://docs.prefect.io/orchestration/flow_config/storage.html#gitlab
    k
    • 2
    • 5
  • j

    James Phoenix

    04/28/2022, 10:12 AM
    We are mapping over a series of requests to create multiple tasks within the same flow.
    • We want to set the task state to failed/pending for each task after it has run.
    • Later, we want to update the task state to succeeded via the GraphQL/REST API.
    • We are using Prefect Cloud as the API backend.
    ----
    • Which Python methods or GraphQL calls should we currently use to update the state of individual task runs inside flow runs?
    a
    • 2
    • 11
  • j

    James Phoenix

    04/28/2022, 10:47 AM
    Or how to make a task that runs after a new blob arrives in a google cloud storage bucket?
    a
    • 2
    • 2
  • d

    David Evans

    04/28/2022, 10:50 AM
    Hi, I'm looking to set up some Prefect-agents-as-a-service within my organisation. We're planning to use the hosted Prefect Cloud with some AWS ECS agents, and a GitHub Actions pipeline to publish new flows (probably using S3 as storage). This is all greenfield so we're pretty flexible on the details. Right now we're sticking to Prefect 1.x, but if there's a good reason to jump to 2.x I don't think that would be a problem (as long as it's safe/secure we don't mind the occasional hiccough while it's still in beta). Most of that is fine; so far we're just using a local agent deployed on EC2, but I can see there's an ECS agent which we can presumably use easily enough to get any scalability that we'll need, and we can use the
    prefect
    CLI to push flows from GitHub Actions. But where we're hitting problems is with dependency management (both internal code which is shared between multiple tasks/flows, and external dependencies). From what I've seen, Prefect doesn't really support this at all (flows are expected to be self-contained single files), with the implication being that the agent itself has to have any shared dependencies pre-installed (which in our case would mean that any significant changes require re-building and re-deploying the agent image - a slow process and not very practical if we have long-lived tasks or multiple people testing different flows at the same time). I tried looking around for Python bundlers and found stickytape, but that seems a bit too rough-and-ready for any real use. This seems to be a bit of a known problem: 1, 2 and specifically I see:
    V2 supports virtual and conda environment specification per flow run which should help some
    And I found some documentation for this (which seems to tie it to the new concept of deployments), but I'm still a bit confused on the details:
    • would the idea be to create a deployment for every version of every flow we push? Will we need to somehow tidy up the old deployments ourselves?
    • can deployments be given other internal files (i.e. common internal code), or is it limited to just external dependencies? Relatedly, do deployments live on the server or in the configured Storage?
    • is there any way to use zipapp bundles?
    • ideally we want engineers to be able to run flows in 3 ways: entirely locally; on a remote runner triggered from their local machine (with local code, including their latest local dependencies); and entirely remotely (pushed to the cloud server via an automated pipeline and triggered or scheduled - basically "push to production"). I'm not clear on how I should be thinking about deployments vs flows to make these 3 options a reality.
    I also wonder if I'm going down a complete rabbit hole and there is an easier way to do all of this?
    a
    • 2
    • 9
  • d

    David Evans

    04/28/2022, 11:58 AM
    🧵 for dependency questions from above: Ideally we're looking to keep our flows minimal, so virtual environments would be preferable over full docker images, but that's not a strict requirement. If docker images will get us what we need, we can work with that. Presumably we'd have to create a docker image (based on
    prefecthq:prefect
    ? or would it just need
    python:3
    ?) for each flow? And I guess these docker images would run
    pip install -r requirements.txt
    as a build layer. But if we can achieve this with a virtual environment instead I think that would be preferable (I'm thinking in terms of the flow needed for an engineer to try something out by pushing it to the runner from their local machine) (I can see the high-level concept here but I'm struggling to see how it will look in practice for the various use-cases)
    a
    • 2
    • 11
  • d

    davzucky

    04/28/2022, 11:58 AM
    Has anyone tried using Yugabyte instead of Postgres with Prefect Server 1.x? Looking to get more resiliency and hot-hot availability.
    a
    • 2
    • 8
  • b

    Baris Cekic

    04/28/2022, 12:49 PM
    Hey there, what is the best practice for generating flows dynamically at runtime and then registering them with the server? We use an in-house platform for designing ETL logic, but to orchestrate the tasks in it we want to generate the flows programmatically.
    a
    • 2
    • 10
  • c

    Chris Reuter

    04/28/2022, 1:02 PM
    Join us in the Cantina today at 3p Eastern! https://prefect-community.slack.com/archives/C036FRC4KMW/p1651150914469329
    🔥 3
  • x

    xyzz

    04/28/2022, 1:13 PM
    If you set up storage while connected to Cloud, where are the secrets stored (e.g. the AWS secret access key)? On the local machine or in Cloud?
    a
    k
    k
    • 4
    • 6
  • a

    Amruth VVKP

    04/28/2022, 2:11 PM
    Hi community, I need some quick help using Prefect Orion's API from an existing Python application. I looked into the documentation for the Prefect Orion Python client, but I am having trouble using it. My existing stack has a FastAPI server running for a given application, and I need to add a feature that allows external users to use my existing API to run any Prefect Orion flow (my existing API could act as a proxy for the time being). Any suggestions?
    a
    • 2
    • 3
  • j

    Joshua Greenhalgh

    04/28/2022, 2:12 PM
    Is there a way to have a schedule on a flow but for it to be turned off by default?
    k
    a
    • 3
    • 6
  • d

    David Evans

    04/28/2022, 2:35 PM
    Hey: we're trying out Prefect Cloud 2.0 but we don't seem to have any way to get 2 users access to the same workspace; is this possible yet?
    a
    • 2
    • 1
  • b

    Bob Colner

    04/28/2022, 3:04 PM
    Orion question, are task run concurrency limits supported?
    a
    k
    • 3
    • 3
  • p

    Philip MacMenamin

    04/28/2022, 3:36 PM
    from prefect import task, Flow
    from typing import Tuple
    @task
    def double_and_triple(x: int) -> Tuple[int, int]:
        return x * 2, x * 3
    with Flow("This works") as flow:
        a = [1, 2, 3]
        double, triple = double_and_triple.map(x=a)
    a
    k
    • 3
    • 6
  • g

    Geoffrey Keating

    04/28/2022, 3:48 PM
    I'm getting familiar with prefect 2.0b3 and I've run into a problem with using classes that own a python logger. I get the following error:
    ValueError: [TypeError("'_thread.RLock' object is not iterable"), TypeError('vars() argument must have __dict__ attribute')]
    Code to reproduce is in the thread. Prefect 1.3 didn't seem to care about loggers being part of a class used in a flow. Are there any patterns worth adopting to replace this, or does this merit a fix?
    a
    k
    • 3
    • 6
  • c

    Chris Reuter

    04/28/2022, 5:19 PM
    👋 hey everyone! Prefect is hosting a happy hour at PyCon US tomorrow night starting at 5:30p. Will we see you there? https://prefect-community.slack.com/archives/C036FRC4KMW/p1651166311560819
    🥂 2
    🔥 1
    🍻 3
  • c

    Chris Reuter

    04/28/2022, 6:45 PM
    Come hang in the Cantina starting in 15 minutes! https://prefect-community.slack.com/archives/C036FRC4KMW/p1651150914469329
    • 1
    • 1
  • a

    Alex Rogozhnikov

    04/28/2022, 6:52 PM
    Hi, I'm testing deployment to prefect's ECSCluster. Can you please elaborate how I can control types of worker instances? And is there a way to change used instance type for every flow individually? Thanks!
    There is also support in ``ECSCluster`` for GPU aware Dask clusters. To do
        this you need to create an ECS cluster with GPU capable instances (from the
        ``g3``, ``p3`` or ``p3dn`` families) and specify the number of GPUs each worker task
        should have.
    k
    • 2
    • 2
  • a

    Amruth VVKP

    04/28/2022, 7:46 PM
    Hello community, this is an Orion UI question. I've got a weird issue: the UI renders badly on my firm's systems but works just fine on my personal computer. Here's how they are configured. My personal computer: Python 3.8.10 running with WSL2 on an Ubuntu distro, SQLite3 running locally on WSL2, and Prefect storage on the local file system. My firm's deployment: Python 3.7.5 running on custom RHEL 7, the SQLite3 DB running from a network drive, the project source code executed from a different network drive, and Prefect storage mapped to yet another network drive (essentially these 3 are on 3 different network drives). Somehow Orion's web UI renders terribly there: it is missing multiple elements, and interactions on the page are terribly slow and pretty much unusable. Any idea what I could do to make the web UI usable? (Orion's web UI is pretty fantastic for my use case.)
    :discourse: 1
    m
    • 2
    • 3
  • t

    Tom Manterfield

    04/28/2022, 8:41 PM
    Hello! I’ve got a load of
    invalid duration format
    errors showing up in my Orion API, just checking if this is a bug or misconfig on my part?
    a
    m
    • 3
    • 58
  • g

    Greg Wyne

    04/28/2022, 10:19 PM
    Hi all! Not sure where to ask this question so crossposting this here: https://prefect-community.slack.com/archives/C014Z8DPDSR/p1651184226881699 Having trouble getting a new server running in our cluster and I’m wondering where I went wrong, thanks!
    ✅ 1
    a
    • 2
    • 1
  • m

    Matthew Roeschke

    04/28/2022, 10:21 PM
    I have a functional API flow in which one task calls
    map
    . I added
    max_retries
    to this task and got this UserWarning, which I don’t really know how to address based on the link. I thought I could pass the results from a functional task to another task?
    UserWarning: Task <...> has retry settings but some upstream dependencies do not have result types. See <https://docs.prefect.io/core/concepts/results.html> for more details.
    a
    • 2
    • 3
  • i

    Izu

    04/29/2022, 10:19 AM
    Hey Guys, I’m just getting started with prefect and I have a flow that uses custom modules. These modules are in the same directory as my flow script, and I do several
    from my_module import my_function
    within the flow script. Now here’s the thing; I have registered the flow and started my local agent. When I try to trigger the job to run from the prefect UI, I get the message:
    Failed to load and execute flow run: FlowStorageError('An error occurred while unpickling the flow:\n ModuleNotFoundError("No module named \'extract_strings\'")\nThis may be due to a missing Python module in your current environment. Please ensure you have all required flow dependencies installed.')
    The `extract_strings` function is defined in another module in the same directory. Can anyone help?
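The error above mentions unpickling, so a minimal standard-library sketch of the general mechanism may help. It assumes nothing about Prefect's internals: it only shows that a pickled function is stored as a reference (module name plus function name), so the module must be importable wherever the pickle is loaded. The module name `helper_mod_demo` and the function `extract` are hypothetical stand-ins.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path


def demo() -> str:
    """Pickle a function from a temporary module, then load it without that module."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical helper module, standing in for a file next to a flow script.
        Path(tmp, "helper_mod_demo.py").write_text(
            "def extract(s):\n    return s.upper()\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("helper_mod_demo")
            # pickle records only "helper_mod_demo.extract", not the function body.
            payload = pickle.dumps(module.extract)
        finally:
            # Simulate a process that cannot see the module's directory.
            sys.path.remove(tmp)
            sys.modules.pop("helper_mod_demo", None)
        importlib.invalidate_caches()
        try:
            pickle.loads(payload)
        except ModuleNotFoundError as exc:
            return f"ModuleNotFoundError: {exc}"
    return "loaded"


print(demo())
```

If the same applies to the flow above, the process that loads the flow would need the directory containing the custom modules on its import path.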
    a
    • 2
    • 1
  • f

    Florian Guily

    04/29/2022, 10:26 AM
    Hey, is there a folder where the logs of a flow triggered manually from the terminal are saved?
    i
    e
    a
    • 4
    • 10
  • v

    Vivek Kaushal

    04/29/2022, 1:19 PM
    Hey folks, I was hoping to understand how Prefect Cloud communicates with an agent that’s inside a VPC. I set up a local agent on an EC2 instance and was able to connect it to Prefect Cloud and use the instance as an agent. Can someone point me to the right resources for this? Thank you!
    a
    • 2
    • 3
  • a

    Amruth VVKP

    04/29/2022, 1:29 PM
    Hello community, I've got a quick question - what would be the best way to add a custom log handler to existing Orion or Prefect logs (I am on Prefect 2.0b3)?
    g
    a
    • 3
    • 6
g

Geoffrey Keating

04/29/2022, 1:34 PM
I asked something similar the other day - maybe this blog post could help? https://www.prefect.io/blog/logs-the-prefect-way/
:upvote: 1
a

Anna Geller

04/29/2022, 1:35 PM
Could you explain your use case? Do you want to attach a custom logger? if so, you could do that by adding an extra logger
prefect config set PREFECT_LOGGING_EXTRA_LOGGERS=scipy
You may also adjust the log level used by specific Orion log handlers. E.g., you could set
PREFECT_LOGGING_HANDLERS_ORION_LEVEL=ERROR
to have only
ERROR
logs reported to Orion. The console handlers will still default to level
INFO
.
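As a rough illustration of what per-handler levels mean, here is a standard-library analogy (not Prefect's own configuration): one logger with two handlers, where the "console" handler accepts `INFO` and above and a second handler, standing in for the one that reports to the API, accepts only `ERROR`. The logger name `demo.flow` and the helper `build_logger` are hypothetical.

```python
import io
import logging


def build_logger(console_stream, api_stream) -> logging.Logger:
    """Return a logger whose two handlers filter at different levels."""
    logger = logging.getLogger("demo.flow")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.handlers.clear()

    console = logging.StreamHandler(console_stream)
    console.setLevel(logging.INFO)   # console keeps INFO and above

    api = logging.StreamHandler(api_stream)
    api.setLevel(logging.ERROR)      # stand-in for the API handler: ERROR only

    for handler in (console, api):
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


console_out, api_out = io.StringIO(), io.StringIO()
log = build_logger(console_out, api_out)
log.info("flow started")
log.error("task failed")

print(console_out.getvalue())  # both records
print(api_out.getvalue())      # only the ERROR record
```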
a

Amruth VVKP

04/29/2022, 1:40 PM
My usual configuration is to use a third-party logger like loguru, use its sink to collect the logs from Python's native log handlers, and attach a custom handler to redirect these logs to an external API. If I need a similar setup for Prefect Orion's logs, I was hoping I could either attach the handler or dump the logs into loguru's sink. Any suggestions?
a

Anna Geller

04/29/2022, 2:12 PM
Good question - loguru is super specific - you would need to add Prefect’s log handlers as loguru
sink
. This Discourse topic is for 1.0, but maybe it gives you some ideas on how to approach it in 2.0
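One possible shape for this, sketched with the standard library only: a `logging.Handler` that forwards each formatted record to an arbitrary callable, which could be a loguru logging call or a function that posts to an external API. The class name `ForwardingHandler` is hypothetical, and attaching it to a logger named `prefect` assumes Prefect's loggers live under that namespace and propagate to it; treat both as assumptions to verify against your Prefect version.

```python
import logging
from typing import Callable, List


class ForwardingHandler(logging.Handler):
    """Forward every formatted log record to a user-supplied callable."""

    def __init__(self, sink: Callable[[str], None], level: int = logging.NOTSET) -> None:
        super().__init__(level)
        self._sink = sink

    def emit(self, record: logging.LogRecord) -> None:
        try:
            self._sink(self.format(record))
        except Exception:
            # Standard logging convention: never let a handler crash the caller.
            self.handleError(record)


received: List[str] = []  # stand-in for a loguru sink or an external API client
handler = ForwardingHandler(received.append)
handler.setFormatter(logging.Formatter("%(name)s | %(levelname)s | %(message)s"))

# Assumption: Prefect's loggers are children of the "prefect" logger.
parent = logging.getLogger("prefect")
parent.setLevel(logging.INFO)
parent.addHandler(handler)

# A child logger's records propagate up to the handler attached on the parent.
logging.getLogger("prefect.flow_runs").info("hello from a flow run")
print(received)
```

Replacing `received.append` with a function that calls loguru or an HTTP client would keep the handler itself unchanged.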
a

Amruth VVKP

05/03/2022, 8:42 AM
Thanks @Anna Geller, this is helpful. Let me give it a shot.
👍 1