prefect-community
  • d

    Deepanshu Aggarwal

    12/16/2022, 8:37 AM
    Is anyone getting this warning while running flows on AWS EKS?
    /usr/local/lib/python3.9/runpy.py:127: RuntimeWarning: 'prefect.engine' found in sys.modules after import of package 'prefect', but prior to execution of 'prefect.engine'; this may result in unpredictable behaviour
      warn(RuntimeWarning(msg))
    ✅ 1
    11 replies · 3 participants
  • s

    Simon Macklin

    12/16/2022, 10:30 AM
    Hey Prefect, we are testing v2 with Kubernetes, and the agent appears to be trying to list namespaces in kube-system. We are using 2.7.1.
    ✅ 1
  • s

    Simon Macklin

    12/16/2022, 10:30 AM
    Any reason it needs to look at kube-system?
    ✅ 1
    3 replies · 2 participants
  • j

    Jarvis Stubblefield

    12/16/2022, 3:59 PM
    I have another issue that may not be related to Prefect, but I thought I would post here on the off chance someone has had this issue when using Prefect. I have a task that looks at about 30-50,000 records total if it looks at all of the records in the table. Transactions are committed in blocks of 1,000 records checked, and within each block fewer than 1,000 records are actually updated. Previously this task would at times take a few hours when it really had to go over 3-5,000 records. The first three days after I started this task it took a while, which was to be expected as I had stopped running the task manually. However, it is now again taking a LONG time, and I’m ending up with MySQL lock wait timeouts:
    MySQLdb.OperationalError: (1205, 'Lock wait timeout exceeded; try restarting transaction')
    … any thoughts or ideas on what might be causing additional delay?
    6 replies · 1 participant
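The message above describes committing in blocks of 1,000 checked records. A minimal sketch of that pattern follows, assuming the standard-library `sqlite3` module as a stand-in for MySQL so that it runs anywhere. The `records` table, the `needs_update` column, the modulo check, and the helper `update_in_batches` are all hypothetical and not taken from the original task.

```python
import sqlite3


def update_in_batches(conn, batch_size=1000):
    """Check every record, committing after each block of `batch_size` checked rows.

    Returns the number of commits issued.
    """
    # Read the ids up front so no read cursor stays open across the commits.
    ids = [row[0] for row in conn.execute("SELECT id FROM records ORDER BY id")]
    commits = 0
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        for record_id in batch:
            # Hypothetical check: only some of the checked rows get updated.
            if record_id % 3 == 0:
                conn.execute(
                    "UPDATE records SET needs_update = 0 WHERE id = ?",
                    (record_id,),
                )
        # Committing here ends the transaction opened by this block's updates.
        conn.commit()
        commits += 1
    return commits


# Build a small in-memory table to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, needs_update INTEGER)")
conn.executemany(
    "INSERT INTO records (id, needs_update) VALUES (?, 1)",
    [(i,) for i in range(1, 2501)],
)
conn.commit()

n_commits = update_in_batches(conn, batch_size=1000)
remaining = conn.execute(
    "SELECT COUNT(*) FROM records WHERE needs_update = 1"
).fetchone()[0]
```

In MySQL, error 1205 means a statement waited longer than `innodb_lock_wait_timeout` for a lock held by another transaction, so a long gap between commits, or a second process touching the same rows, may be worth checking. Whether either applies here is not something the message confirms.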
  • n

    Nils

    12/16/2022, 4:18 PM
    I'm trying to set persist_result to False since the object it returns is not compatible with the Pickle serializer. However, I'm receiving the following error. I'm running on 2.7.2.
    Flow could not be retrieved from deployment.
    Traceback (most recent call last):
      File "<frozen importlib._bootstrap_external>", line 883, in exec_module
      File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
      File "/opt/prefect/main.py", line 6, in <module>
        from steps.parse.parser import parse
      File "/opt/prefect/steps/parse/parser.py", line 9, in <module>
        from .html_parser import process_html
      File "/opt/prefect/steps/parse/html_parser.py", line 22, in <module>
        @task(persist_result=False)
    TypeError: task() got an unexpected keyword argument 'persist_result'
    2 replies · 2 participants
  • d

    Devansh Doshi

    12/16/2022, 6:00 PM
    Hi all, these might be questions that get asked a lot, but I am just starting to contribute to open-source projects, so please help me out. Are there any best practices or conventions I need to follow if I want to contribute to this project? A few questions that I have: What is the branch naming convention? Is there a commit message convention, like issue-id: commit-msg? From which branch should I create a new branch for the issue I will be working on, and which branch should I raise a PR against? I understand that the main branch is unstable, so it is probably not the best idea to branch from there. Feel free to add any other important conventions that I need to be aware of.
    👀 1
    ✅ 1
    9 replies · 3 participants
  • z

    zlee

    12/16/2022, 6:07 PM
    Hi all -- I'm self-hosting prefect via kubernetes (and using aws RDS for db) and I realized my db has been steadily running out of space since we have a lot of regularly running flows. Is there something I can set like a retention period so that flow run data from older than x days is deleted automatically?
    2 replies · 2 participants
  • d

    Deepanshu Aggarwal

    12/16/2022, 6:21 PM
    Hi all. I have one question for people running Prefect Orion on their self-hosted instances: do you get a lot of 502 errors? Is there any way to avoid these? Adding retries to tasks and flows does help, but it would be great if they could be avoided altogether. Thank you.
    5 replies · 2 participants
  • a

    Andrew

    12/16/2022, 6:43 PM
    Suggestion to add .git to the default .prefectignore
    ✅ 1
    👍 1
    2 replies · 2 participants
  • s

    Sean Davis

    12/16/2022, 7:32 PM
    Does anyone have a good example of using trio/asyncio directly in a task/flow? I have a basic 4-function script that uses trio memory channels. Specifically, it uses a producer/consumer with send/receive channels to achieve 40-fold concurrency for API scraping. When I try various approaches to convert it to flows/tasks, it either hangs after completion or complains about lacking an event loop. I've tried a number of combinations and haven't found anything that works, including just a flow function that calls the other functions.
    2 replies · 2 participants
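The producer/consumer layout described above can be sketched with the standard library alone. This is a minimal illustration, assuming `asyncio.Queue` in place of trio memory channels and a sleeping `fetch` in place of a real API request; all function names are hypothetical. It is not Prefect-specific and has not been tested inside a flow or task.

```python
import asyncio


async def fetch(item):
    # Hypothetical stand-in for one API request.
    await asyncio.sleep(0.01)
    return item * 2


async def producer(queue, items, n_consumers):
    for item in items:
        await queue.put(item)
    # One sentinel per consumer tells each of them to stop.
    for _ in range(n_consumers):
        await queue.put(None)


async def consumer(queue, results):
    while True:
        item = await queue.get()
        if item is None:
            break
        results.append(await fetch(item))


async def scrape(items, n_consumers=40):
    # The bounded queue applies backpressure to the producer.
    queue = asyncio.Queue(maxsize=n_consumers)
    results = []
    consumers = [
        asyncio.create_task(consumer(queue, results)) for _ in range(n_consumers)
    ]
    await producer(queue, items, n_consumers)
    # Waiting for every consumer to exit avoids leaving tasks pending at shutdown.
    await asyncio.gather(*consumers)
    return results


def run_scrape(items):
    # Synchronous entry point that owns the event loop from start to finish.
    return asyncio.run(scrape(items))


results = run_scrape(list(range(100)))
```

Keeping the whole event loop inside one synchronous function is only one possible design. Whether calling such a function from a flow avoids the hang reported above depends on the Prefect version and is an untested assumption here.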
  • p

    Patrick Tan

    12/16/2022, 10:04 PM
    Hi, I am calling a function twice in a sequential flow, and each call uses different input parameters. It seems like the first set of input parameters is cached and reused for the second call. How can I disable caching of input parameters?
    ✅ 1
    2 replies · 2 participants
  • l

    lialzm

    12/17/2022, 9:51 AM
    Hi team, how do I cancel the host label when registering a flow? I didn't find the answer in the docs.
    1️⃣ 1
    ✅ 1
    1 reply · 2 participants
  • t

    Tim-Oliver

    12/17/2022, 3:28 PM
    Hi, I am having trouble using the map functionality. When I use map not all tasks are being scheduled and after completing the scheduled tasks the flow hangs in a running state.
    12 replies · 2 participants
  • f

    Florian Kühnlenz

    12/17/2022, 7:53 PM
    We are seeing quite a lot of intermittent API errors and consequent flow run failures in Cloud v1. Anyone else?
    14 replies · 2 participants
  • f

    Fady Khallaf

    12/18/2022, 11:20 AM
    Hello, is Prefect compatible with the ARM architecture? If anyone has tried Prefect on ARM, would you recommend it for production use?
    4 replies · 2 participants
  • a

    Andrei Tulbure

    12/18/2022, 2:14 PM
    Hi. I have a quick question: We recently migrated from Prefect 1 to Prefect 2 and mostly everything went smoothly. The one problem I have: we had some flows running easily in parallel on ECS on Prefect 1, and right now, if we try to run them in parallel on ECS in Prefect 2, they are not launching in parallel. We do not use Dask; we simply launch them with the task call in a for loop. Could you point us to any resources on how to launch them in parallel on ECS? (Yes, we have enough of a vCPU quota on AWS.)
    8 replies · 4 participants
  • a

    Agnieszka

    12/18/2022, 3:19 PM
    Hello, what would be the best way to run external Docker containers as Prefect tasks? I have a pipeline/flow that is supposed to run a few "external" Docker images on some data (sequentially, where one's output, or part of it, is used as input for the next task). The Docker images contain Python scripts/modules that are not integrated with Prefect. I would like to be able to pull those images from the private registry and run containers based on those images with custom parameters (based on settings defined by the user, which I pass to my flows). The pipeline is meant to be used by users with no Prefect knowledge, who can just git pull, install everything with pip, and run it with some custom parameters. That's why I would prefer not to go for Orion infrastructure blocks and deployments, but rather hide all the Prefect magic under a Python CLI. The naive approach I was thinking about is to have a custom Docker executor class/module (running containers) whose methods will be called by Prefect tasks, but maybe there is a better way to make it work? I am completely new to Prefect, so any advice is much appreciated.
    ✅ 1
    :docker_ship: 1
    7 replies · 2 participants
  • k

    Kelvin DeCosta

    12/19/2022, 8:54 AM
    Quick question: Is there a way to programmatically create / update a deployment, say via .build_from_flow, but keep it disabled / turned off, similar to how we can turn them off via the UI?
    ✅ 1
    2 replies · 2 participants
  • t

    Tobias

    12/19/2022, 10:44 AM
    Has anyone else been unable to delete a flow/deployment? 🧵 →
    ✅ 1
    16 replies · 2 participants
  • n

    Nic

    12/19/2022, 10:45 AM
    Has anybody had success with running a Prefect agent locally as a background process on a Windows machine, like you can do with supervisor on a Linux system? I've found this thread: https://discourse.prefect.io/t/how-to-run-your-prefect-agent-as-a-background-process-with-supervisord/1551, but it seems like supervisor-win might be broken currently. How else could I go about this?
    ✅ 1
    10 replies · 4 participants
  • p

    Patrick Tan

    12/19/2022, 2:44 PM
    Hi, we have been receiving a "prefect.agent - Invalid input" issue. Any experience with this?
    ✅ 1
    4 replies · 2 participants
  • j

    James Zhang

    12/19/2022, 4:12 PM
    Hey guys, has anyone seen this kind of strange Numba error before? I was trying to train a Top2Vec model, which uses UMAP and Numba at the lower level. What's strange is that, starting from a certain amount of training data, the training throws this error:
    ...
    File "/opt/prefect/workflow/tasks/model_train.py", line 64, in train_model
        model = Top2Vec(
      File "/opt/conda/envs/prefect/lib/python3.9/site-packages/top2vec/Top2Vec.py", line 668, in __init__
        umap_model = umap.UMAP(**umap_args).fit(self.document_vectors)
      File "/opt/conda/envs/prefect/lib/python3.9/site-packages/umap/umap_.py", line 2516, in fit
        ) = nearest_neighbors(
      File "/opt/conda/envs/prefect/lib/python3.9/site-packages/umap/umap_.py", line 328, in nearest_neighbors
        knn_search_index = NNDescent(
      File "/opt/conda/envs/prefect/lib/python3.9/site-packages/pynndescent/pynndescent_.py", line 920, in __init__
        self._neighbor_graph = nn_descent(
      File "/opt/conda/envs/prefect/lib/python3.9/site-packages/numba/core/dispatcher.py", line 468, in _compile_for_args
        error_rewrite(e, 'typing')
      File "/opt/conda/envs/prefect/lib/python3.9/site-packages/numba/core/dispatcher.py", line 409, in error_rewrite
        raise e.with_traceback(None)
    numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
    Failed in nopython mode pipeline (step: nopython frontend)
    Untyped global name 'print': Cannot determine Numba type of <class 'function'>

    File "../conda/envs/prefect/lib/python3.9/site-packages/pynndescent/pynndescent_.py", line 252:
    def nn_descent_internal_low_memory_parallel(
        <source elided>
            if verbose:
                print("\t", n + 1, " / ", n_iters)
                ^

    During: resolving callee type: type(CPUDispatcher(<function nn_descent_internal_low_memory_parallel at 0x7fe7ea5adee0>))
    During: typing of call at /opt/conda/envs/prefect/lib/python3.9/site-packages/pynndescent/pynndescent_.py (358)

    During: resolving callee type: type(CPUDispatcher(<function nn_descent_internal_low_memory_parallel at 0x7fe7ea5adee0>))
    During: typing of call at /opt/conda/envs/prefect/lib/python3.9/site-packages/pynndescent/pynndescent_.py (358)

    File "../conda/envs/prefect/lib/python3.9/site-packages/pynndescent/pynndescent_.py", line 358:
    def nn_descent(
        <source elided>
        if low_memory:
            nn_descent_internal_low_memory_parallel(
            ^
    The whole thing runs perfectly fine on my local machine, and the Python environment is the same, but on the Prefect KubernetesJob it has the problem… I’m sure I’ve given the job enough resources (CPU & RAM)… Does it have anything to do with the parallelism? I don’t know much about the Numba CPUDispatcher; could it be that it’s not supported in a Prefect task? I've been stuck on this for days… 😣
    14 replies · 2 participants
  • m

    Michał Augoff

    12/19/2022, 7:36 PM
    hey, is it possible to have dynamic default date parameters in deployed flows? I know date parameters are rendered in the UI with a calendar, and I know you can use Optional[date] and dynamically assign e.g. today’s date inside the flow code. But I’m wondering if there’s a way to say “today is default” and have that reflected in the UI when I create a new run (instead of having None in the UI)
    ✅ 1
    13 replies · 4 participants
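The `Optional[date]` approach mentioned in the message above can be sketched as follows. This is a minimal illustration in plain Python, assuming a hypothetical helper `resolve_run_date` and a hypothetical `my_flow` function that would carry `@flow` in a real project. It does not address how the default is rendered in the UI.

```python
from datetime import date
from typing import Optional


def resolve_run_date(run_date: Optional[date] = None) -> date:
    """Return the given date, or today's date when none was passed."""
    # date.today() is evaluated on every call, not once when the function is defined.
    return run_date if run_date is not None else date.today()


def my_flow(run_date: Optional[date] = None) -> str:
    # Hypothetical flow body; in Prefect this function would be decorated with @flow.
    effective = resolve_run_date(run_date)
    return effective.isoformat()
```

Writing `def my_flow(run_date: date = date.today())` instead would freeze the default at the moment the function is defined, because Python evaluates default arguments once at definition time. That is why the `None` sentinel is used here.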
  • g

    Greg Ott

    12/19/2022, 8:13 PM
    Hello all! Is there support/a timeline for support for Delta Live Table pipelines in Azure Databricks? I checked the prefect-databricks package, but it appears that the supported API calls are mainly for kicking off traditional databricks jobs, not DLT pipelines.
    ✅ 1
    2 replies · 2 participants
  • e

    Edmund Tian

    12/19/2022, 9:00 PM
    Is it possible to trigger multiple concurrent Flow Runs for the same Flow? I’m getting this error when I attempt to do so:
    RuntimeError: The task runner is already started!
    Context: I have an application hosted on GCP Cloud Run that allows multiple requests per container. Whenever my application receives an API request with a user_id param, it triggers a flow run to update data for that user. When I trigger the first flow run via an API request, everything works fine. But if I call the API again while the first run is ongoing, my second API request will throw the above RuntimeError. More context: My application is written in Flask and served with Gunicorn. It’s deployed on GCP Cloud Run. Here’s the Gunicorn config in my Dockerfile:
    CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 pelm_utility_service:app
    ✅ 1
    4 replies · 2 participants
  • j

    Jason Noxon

    12/19/2022, 9:45 PM
    Hi, all! I installed Prefect 2 (no issues) and now I am trying to run prefect cloud login. I enter my API key, and I get an Index out of Range exception. Has anyone seen this?
    ✅ 1
  • j

    Jason Noxon

    12/19/2022, 9:54 PM
    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/dist-packages/prefect/cli/_utilities.py", line 41, in wrapper
        return fn(*args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/prefect/utilities/asyncutils.py", line 205, in coroutine_wrapper
        return run_async_in_new_loop(async_fn, *args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/prefect/utilities/asyncutils.py", line 156, in run_async_in_new_loop
        return anyio.run(partial(__fn, *args, **kwargs))
      File "/usr/local/lib/python3.8/dist-packages/anyio/_core/_eventloop.py", line 70, in run
        return asynclib.run(func, *args, **backend_options)
      File "/usr/local/lib/python3.8/dist-packages/anyio/_backends/_asyncio.py", line 292, in run
        return native_run(wrapper(), debug=debug)
      File "/usr/lib/python3.8/asyncio/runners.py", line 44, in run
        return loop.run_until_complete(main)
      File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
        return future.result()
      File "/usr/local/lib/python3.8/dist-packages/anyio/_backends/_asyncio.py", line 287, in wrapper
        return await func(*args)
      File "/usr/local/lib/python3.8/dist-packages/prefect/cli/cloud.py", line 429, in login
        workspace = current_workspace or workspaces[0]
    IndexError: list index out of range
    ✅ 1
    10 replies · 3 participants
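The last frame of the traceback above evaluates `current_workspace or workspaces[0]`, which raises `IndexError` whenever `current_workspace` is falsy and `workspaces` is empty. A minimal sketch of a guarded version follows, assuming plain strings stand in for workspace objects. `pick_workspace` is a hypothetical helper, not part of Prefect, and an account with no workspaces being the cause here is an assumption.

```python
from typing import Optional, Sequence


def pick_workspace(
    current: Optional[str], workspaces: Sequence[str]
) -> Optional[str]:
    """Prefer the current workspace, else the first available one, else None."""
    if current:
        return current
    # Indexing an empty sequence is what raises IndexError, so check first.
    return workspaces[0] if workspaces else None
```

A caller can then treat a `None` result as "no workspace available" and report that to the user instead of crashing.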
  • p

    Praveen Shilavantar

    12/19/2022, 10:03 PM
    Hi, is it possible to add @flow to existing class methods?
    class API:
        @flow
        def start(self):
            pass
    I am getting TypeError: missing a required argument: 'self'
    ✅ 1
    11 replies · 2 participants
  • p

    Puneetjindal 11

    12/20/2022, 9:57 AM
    How do I start a flow run for deployment B from a flow run of deployment A, and what are the rate limits there?
    5 replies · 2 participants
  • s

    Slackbot

    12/20/2022, 12:35 PM
    This message was deleted.