prefect-community
  • c

    Charles Liu

    03/22/2021, 9:53 PM
    Just want to clarify something about Cloud concurrency: even if you have multiple agents, the ability to run tasks is still hard-limited by the concurrency limit, correct?
    k
    • 2
    • 7
  • a

    Alexandru Sicoe

    03/22/2021, 11:11 PM
    Hello everyone, we're currently evaluating Prefect as our workflow scheduling solution. So far we love the simple API and architecture, especially with Prefect Cloud, which makes it very easy to get going. We do have a question, however: what is the best practice for project structure, packaging, and CI/CD with Prefect Cloud and the Kubernetes Agent? We have multiple jobs across multiple GitHub repos. These are generally various modules in various packages. These packages are mostly added to custom Docker images tailored for the other systems where we deploy. They also have complex dependencies from other public and private PyPI repos. Thanks, Alex. P.S. Further details moved into the thread
    c
    k
    • 3
    • 5
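    One commonly seen pattern (a sketch under assumed registry URLs and package names, not a definitive recommendation) is to bake each repo's packages and private dependencies into the flow's Docker storage and register the flow from CI:
    from prefect import Flow, task
    from prefect.run_configs import KubernetesRun
    from prefect.storage import Docker

    @task
    def extract():
        return [1, 2, 3]

    with Flow("example-etl") as flow:
        extract()

    # Registry URL, base image, and package names below are placeholders;
    # the base image would already contain the private PyPI dependencies.
    flow.storage = Docker(
        registry_url="registry.example.com/data-team",
        base_image="registry.example.com/data-team/base:latest",
        python_dependencies=["my-internal-package"],
    )
    flow.run_config = KubernetesRun()

    if __name__ == "__main__":
        flow.register(project_name="my-project")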
  • t

    Tsang Yong

    03/23/2021, 12:19 AM
    Hi Team, I'm trying to access the result of a failed task with the following code.
    from dask_kubernetes import KubeCluster
    from prefect.executors import DaskExecutor

    cluster = KubeCluster.from_yaml(dask_worker_spec_file_path)
    cluster.adapt(minimum=1, maximum=10)
    executor = DaskExecutor(cluster.scheduler_address)
    state = flow.run(executor=executor)
    but when I try to access the state I'm getting this.
    Python 3.8.6 (default, Dec 11 2020, 14:38:29)
    Type 'copyright', 'credits' or 'license' for more information
    IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.
    
    In [1]: state
    Out[1]: <Failed: "Unexpected error: TypeError('Could not serialize object of type Failed.\nTraceback (most recent call last):\n  File "/usr/local/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 49, in dumps\n    result = pickle.dumps(x, **dump_kwargs)\nTypeError: cannot pickle \'_thread.RLock\' object\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File "/usr/local/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 307, in serialize\n    header, frames = dumps(x, context=context) if wants_context else dumps(x)\n  File "/usr/local/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 58, in pickle_dumps\n    frames[0] = pickle.dumps(\n  File "/usr/local/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 60, in dumps\n    result = cloudpickle.dumps(x, **dump_kwargs)\n  File "/usr/local/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 73, in dumps\n    cp.dump(obj)\n  File "/usr/local/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 563, in dump\n    return Pickler.dump(self, obj)\nTypeError: cannot pickle \'_thread.RLock\' object\n')">
    any idea what I'm doing wrong?
    c
    • 2
    • 5
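    A minimal sketch of inspecting per-task states after flow.run(), which may help narrow this down (the failing task and executor here are placeholders; the pickling error itself usually means a task's return value, e.g. a client or lock-holding object, can't be serialized back from the Dask workers):
    from prefect import Flow, task
    from prefect.executors import DaskExecutor

    @task
    def might_fail():
        raise ValueError("boom")

    with Flow("inspect-failures") as flow:
        ref = might_fail()

    # Run the flow and look up the state of an individual task.
    state = flow.run(executor=DaskExecutor())
    task_state = state.result[ref]   # the Failed state of `might_fail`
    print(task_state.is_failed())    # True
    print(task_state.message)        # error message
    print(task_state.result)         # the exception, if it could be returned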
  • m

    Mahesh

    03/23/2021, 1:04 PM
    Hi, I am new to Prefect. I'm trying to run a Snowflake query with Prefect; please find the code below.
    import prefect
    from prefect.tasks.snowflake.snowflake import SnowflakeQuery
    from prefect import task, Flow
    
    query = """
        SHOW DATABASES;
    """
    
    snowflake_def = SnowflakeQuery(
        account="account",
        user="user",
        password="****",
        database="***",
        warehouse="****",
        role="***",
        query=query
    )
    
    with Flow("hello-snowflake") as flow:
        snowflake_def()
    
    flow.register(project_name="tutorial")
    flow.run()
    When I trigger a quick run from the UI, I'm facing the issue below.
    Unexpected error: TypeError("cannot pickle '_thread.lock' object")
    Traceback (most recent call last):
      File "/opt/prefect_env/lib/python3.8/site-packages/prefect/engine/runner.py", line 48, in inner
        new_state = method(self, state, *args, **kwargs)
      File "/opt/prefect_env/lib/python3.8/site-packages/prefect/engine/task_runner.py", line 900, in get_task_run_state
        result = self.result.write(value, **formatting_kwargs)
      File "/opt/prefect_env/lib/python3.8/site-packages/prefect/engine/results/local_result.py", line 116, in write
        value = self.serializer.serialize(new.value)
      File "/opt/prefect_env/lib/python3.8/site-packages/prefect/engine/serializers.py", line 73, in serialize
        return cloudpickle.dumps(value)
      File "/opt/prefect_env/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 72, in dumps
        cp.dump(obj)
      File "/opt/prefect_env/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 540, in dump
        return Pickler.dump(self, obj)
    TypeError: cannot pickle '_thread.lock' object
    I set checkpoint to False.
    j
    • 2
    • 2
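    The traceback above shows the LocalResult trying to cloudpickle the task's return value when checkpointing; Cloud/Server runs enable checkpointing by default via an environment variable set on the agent, so disabling it locally may not take effect. A hedged sketch of disabling checkpointing on the task itself (checkpoint is a generic Task keyword; credentials are placeholders):
    from prefect import Flow
    from prefect.tasks.snowflake.snowflake import SnowflakeQuery

    snowflake_query = SnowflakeQuery(
        account="account",
        user="user",
        password="****",
        database="***",
        warehouse="****",
        role="***",
        query="SHOW DATABASES;",
        checkpoint=False,  # don't try to pickle the query result objects
    )

    with Flow("hello-snowflake") as flow:
        snowflake_query()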
  • i

    Igor Bondartsov

    03/23/2021, 2:45 PM
    Hi Team, maybe someone knows how to set a custom run name for a flow run. For example, I don't want to see a random run name (red-octopus); I want to see <flow_name>_<datetime>.
    j
    s
    +2
    • 5
    • 16
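    One possible sketch of renaming the current run from inside the flow, assuming Client.set_flow_run_name is available in your Prefect version (the task and naming pattern below are illustrative only):
    from datetime import datetime

    import prefect
    from prefect import Flow, task
    from prefect.client import Client

    @task
    def rename_current_run():
        # Rename this flow run to "<flow_name>_<datetime>".
        # Assumes Client.set_flow_run_name exists in this Prefect version.
        flow_run_id = prefect.context.get("flow_run_id")
        flow_name = prefect.context.get("flow_name")
        new_name = f"{flow_name}_{datetime.utcnow():%Y%m%d_%H%M%S}"
        Client().set_flow_run_name(flow_run_id, new_name)

    with Flow("my-flow") as flow:
        rename_current_run()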
  • e

    emre

    03/23/2021, 3:01 PM
    Hey everyone, just out of curiosity, is there something like an HttpGetTask (or another task for the HTTP request family)? Lately I find myself GETting a lot of results from random endpoints, and thought it could save boilerplate code on my end.
    m
    • 2
    • 3
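    I'm not aware of a built-in HTTP task in this version; a minimal custom GET task might look like the sketch below (the class name, the requests dependency, and the example URL are assumptions, not an existing Prefect API):
    import requests
    from prefect import Flow, Task

    class HttpGetTask(Task):
        """Tiny reusable GET task; name and parameters are illustrative."""

        def run(self, url: str, params: dict = None) -> dict:
            response = requests.get(url, params=params, timeout=30)
            response.raise_for_status()
            return response.json()

    http_get = HttpGetTask()

    with Flow("http-get-example") as flow:
        data = http_get(url="https://httpbin.org/get")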
  • j

    Javier Domingo Cansino

    03/23/2021, 5:25 PM
    o/ Does anyone know of any document that contains information on the overall implementation? I'm trying to understand how to deploy Prefect in my k8s cluster, and from what I understand, Dask is a requirement to run on Kubernetes, but I don't understand what the relationship is there. Does a Prefect agent create a Dask cluster consisting only of itself and then run tasks on it? Or am I supposed to create a Dask cluster, with all dependencies installed, and then connect the agent to it?
    m
    j
    +2
    • 5
    • 26
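    For orientation: Dask is not strictly required — by default a flow run executes with a LocalExecutor inside the job the Kubernetes agent launches, and a Dask executor is opt-in (either a local/in-process Dask cluster or an external one you manage). A hedged sketch of the common "no external Dask cluster" setup (the image tag and scheduler choice are placeholders):
    from prefect import Flow, task
    from prefect.executors import LocalDaskExecutor
    from prefect.run_configs import KubernetesRun

    @task
    def say_hi():
        print("hi")

    with Flow("k8s-example") as flow:
        say_hi()

    # The Kubernetes agent creates a job per flow run; the executor runs inside
    # that job, so no separate Dask cluster is needed unless you point a
    # DaskExecutor at an existing scheduler address.
    flow.run_config = KubernetesRun(image="prefecthq/prefect:0.14.8")
    flow.executor = LocalDaskExecutor(scheduler="threads")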
  • j

    Jonathan Wright

    03/23/2021, 5:47 PM
    I’ve noticed that Prefect’s Bitbucket storage class is “for Bitbucket Server only”. Has anyone written or attempted to write one for Bitbucket Cloud? https://docs.prefect.io/api/latest/storage.html#bitbucket
    j
    • 2
    • 2
  • g

    Gleb Erokhin

    03/23/2021, 7:15 PM
    Hi! I'm very new to Prefect. I'm trying to run some provisioning workflows using Prefect. Our application runs in a Kubernetes environment. What is the best pattern to check whether the Prefect workflow app is alive? Are there any hooks for that? For example, if one of the database connections drops, I want Kubernetes to know about it via a failing health check. Deploying a k8s agent looks like overkill, and the UI will not be used (basically, Core only). Is there a simple way to check if the app is alive? Appreciate any help/advice!
    n
    • 2
    • 2
  • d

    David Elliott

    03/23/2021, 8:09 PM
    Hey all! I'm hitting an error which I think might be related to scaling the number of static tasks in my flow, wanted to get your thoughts? I have the following setup: • Prefect Cloud Server, Kubernetes Agent (k8s run config), DaskExecutor(make_cluster) -> spawns 1 dask worker, 4cpu, 12threads • Prefect v0.14.8, Docker storage in ECR I can run a flow with 196 tasks no problem (this is ~15% of our whole ETL). The UI even loads the schematic, and the flow runs to completion. All the tasks are doing is running queries on Snowflake - no data manipulation/results handling, just issuing SQL queries. When I generate the flow file with all 1192 tasks in it, I'm getting the
    400 Client Error:
    ...
    "input.states[0].task_run_id"; Expected non-nullable type UUID! not to be null.
    on some of the tasks when I run the flow. I'll put the full stack trace in the 🧵. It's happening on maybe 1 in every 20 tasks or so. The task then gets put into the 'ClientFailed' state (and the UI can't see them), and all downstream dependents of those tasks then get set to 'Pending'. I've tried many Dask workers, then just one Dask worker (for simplicity); same issue. I can't replicate it with the smaller (196-task) flow. I'm wondering if there's some kind of rate limiting going on whereby so many concurrent tasks are running simultaneously (there are a ton all trying to run at the same time) that some of them are getting a generic error from Cloud or something? I would try adding a task concurrency limit to test that hypothesis, but the UI says it's not included in our plan (even though we're an enterprise tenant). Is it possible to set task concurrency at the flow level? Also, the UI can't load the schematic of the big flow, though that's less of an immediate concern. Thanks in advance for any advice!
    👀 1
    j
    • 2
    • 4
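    On the concurrency question: Cloud task-concurrency limits are applied per tag rather than per flow, so one hedged sketch (assuming the feature is enabled on the account) is to tag the query tasks and set a limit on that tag in the UI/API; the tag name below is just an example:
    from prefect import Flow, task

    # A Cloud concurrency limit configured for the "snowflake" tag would then
    # throttle how many of these tasks run at once across flow runs.
    @task(tags=["snowflake"])
    def run_query(sql: str):
        print(f"running: {sql}")

    with Flow("big-etl") as flow:
        for sql in ["SELECT 1", "SELECT 2"]:
            run_query(sql)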
  • k

    Kelly Huang

    03/23/2021, 9:11 PM
    Hey! How can I keep an ECS agent running forever on Fargate, without it timing out and without an EC2 instance?
    m
    s
    • 3
    • 6
  • j

    Jillian Kozyra

    03/23/2021, 11:49 PM
    hey, I'm trying to enable mypy on our Prefect project. What's the best way to handle imports? If we do
    from prefect import Flow, Parameter, context, task, unmapped
    mypy complains, but flows work. If we do
    from prefect.src import Flow, Parameter, context, task, unmapped
    mypy is happy, but Python complains:
    ModuleNotFoundError: No module named 'prefect.src'
    m
    • 2
    • 2
  • r

    Reece Hart

    03/24/2021, 4:05 AM
    Anyone out there used Prefect for genomic sequencing pipelines?
  • m

    Michael Wedekindt

    03/24/2021, 8:26 AM
    Hi folks, I'm trying to start my very first agent to play with the Prefect Cloud UI, but when I start an agent with "prefect agent start --name "Default Agent" --token ..." as described in the docs, I get a message that a connection could not be established because the target machine refused the connection. What did I do wrong? I see it tried to connect to a local address, but I would expect the agent to connect to Prefect Cloud, shouldn't it? Thanks in advance! Michael
    ✅ 1
    j
    • 2
    • 4
  • v

    Varun Joshi

    03/24/2021, 11:04 AM
    Hi Prefecters, I ran across an issue where my flow failed because the log said 'cannot allocate memory'. We're running flows using a local agent on our VM. Does the error mean that the VM has run out of memory, or is it Prefect Cloud running out of memory?
    j
    • 2
    • 2
  • j

    Jacob Blanco

    03/24/2021, 11:17 AM
    Is there any way to disable a schedule set in code from cloud? Or set parameters per schedule in cloud?
    j
    j
    • 3
    • 4
  • j

    Jeffery Newburn

    03/24/2021, 2:35 PM
    We are running a typical ETL transform on a dataset in Prefect that does not fit in memory. We are looking at the best way to do multiple transformations on the data. Currently, we have 1 task that: • Queries the database • Transforms the data one record at a time • Saves each transformed record to a file Does Prefect have a good way to do this in multiple tasks without overrunning memory? Like Task A(Read data)->Task B(first transform)->Task C(second transformation)->Task D(write data)?
    s
    • 2
    • 2
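    One common pattern (a sketch under the assumption that the source table can be read in batches) is to pass lightweight batch identifiers between tasks and map over them, so each mapped child only holds one batch in memory:
    from prefect import Flow, task

    @task
    def list_batches(n_batches: int = 100):
        # Return batch identifiers (e.g. key ranges), not the data itself.
        return list(range(n_batches))

    @task
    def extract_batch(batch_id: int):
        # Query only this batch from the database.
        return [f"record-{batch_id}-{i}" for i in range(10)]

    @task
    def first_transform(records):
        return [r.upper() for r in records]

    @task
    def second_transform(records):
        return [r + "!" for r in records]

    @task
    def write_batch(records):
        # Append this batch's output to a file or object store.
        print(f"writing {len(records)} records")

    with Flow("chunked-etl") as flow:
        batches = list_batches()
        raw = extract_batch.map(batches)
        t1 = first_transform.map(raw)
        t2 = second_transform.map(t1)
        write_batch.map(t2)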
  • a

    Aaron Richter

    03/24/2021, 2:52 PM
    Hi all! Does anyone have a good solution for aggregating worker logs when executing with a DaskExecutor + external cluster? When running with the LocalExecutor, it's nice to see all the logs for each task printed to the stdout of wherever I'm running it. When running with an external Dask cluster, I need to go into each worker's logs to see the logs for the tasks that worker receives. It would be nice to still get a global sense of how all the tasks are running.
    j
    s
    • 3
    • 3
  • i

    Irfan Habib

    03/24/2021, 4:16 PM
    Hi all! Is it possible to chain flows? Flow B should run on a daily schedule, but only when Flow A (which is on a daily schedule too) has successfully completed.
    a
    • 2
    • 1
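    A minimal sketch of one way to do this with an orchestrator flow and StartFlowRun (the flow and project names are placeholders; both child flows must already be registered):
    from prefect import Flow
    from prefect.schedules import CronSchedule
    from prefect.tasks.prefect import StartFlowRun

    start_flow_a = StartFlowRun(flow_name="Flow A", project_name="my-project", wait=True)
    start_flow_b = StartFlowRun(flow_name="Flow B", project_name="my-project", wait=True)

    with Flow("daily-orchestrator", schedule=CronSchedule("0 6 * * *")) as parent_flow:
        a = start_flow_a()
        # Flow B only starts once Flow A's run has finished successfully.
        b = start_flow_b(upstream_tasks=[a])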
  • w

    Will Milner

    03/24/2021, 5:45 PM
    is creating tasks in a loop a good idea? say I have something like
    for x in range(3):
       task = some_task(x)
    
    final_task = another_task(upstream_tasks=task)
    I see 3 tasks get created in the loop, but the final task only has 1 upstream task, instead of all the tasks created in the loop.
    k
    • 2
    • 7
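    A small sketch of the usual fix — collect the tasks created in the loop into a list and pass the whole list as upstream dependencies (the task names here are placeholders):
    from prefect import Flow, task

    @task
    def some_task(x):
        return x * 2

    @task
    def another_task():
        print("all upstream tasks finished")

    with Flow("loop-example") as flow:
        # Keep a reference to every task created in the loop instead of
        # overwriting the same variable on each iteration.
        loop_tasks = [some_task(x) for x in range(3)]
        final_task = another_task(upstream_tasks=loop_tasks)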
  • c

    Charles Liu

    03/24/2021, 6:56 PM
    Anyone else dealt with "No heartbeat" errors? Just curious about possible solutions and causes other than a lack of compute resources. I've been load testing overnight and two pipelines just failed suddenly after 600+ runs.
    j
    c
    • 3
    • 9
  • c

    Charles Liu

    03/24/2021, 7:20 PM
    New context: The flow actually succeeded on the 4th attempt. Curious as to why this could happen! Thanks!
  • a

    Adam Lewis

    03/24/2021, 8:01 PM
    Hello everyone, I have a paused task in a flow run, and I want to start it programmatically (I imagine via GraphQL). Do I just set the state to "Running" in GraphQL to start it again, or something else? Anyone know?
    j
    • 2
    • 2
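    A hedged sketch of resuming a paused task run via the Python client rather than raw GraphQL, by setting the task run to a Resume state (the task_run_id and version are placeholders; the version must match the task run's current state version):
    from prefect.client import Client
    from prefect.engine.state import Resume

    client = Client()
    client.set_task_run_state(
        task_run_id="<task-run-id>",
        version=7,          # the task run's current state version, queryable via GraphQL
        state=Resume(),
    )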
  • n

    Nathan Walker

    03/24/2021, 8:07 PM
    Does anybody have "How Prefect Works (TM)" diagrams (or videos or blogs)? I'm thinking something like https://docs.prefect.io/orchestration/server/architecture.html but more granular -- I'm trying to understand what's actually happening when a Flow is scheduled to run in Server, including how the Agent communicates with the Executor. Something like "A Flow/Task, from Start to Finish" would be mega helpful. I'm piecing things together from the docs and code, but if there's a hand-holding walkthrough or diagram available, I'm all ears.
    j
    • 2
    • 8
  • a

    Alex Papanicolaou

    03/24/2021, 10:57 PM
    Hi, regarding email cloudhook, would it be possible to send a simpler markdown-style email like the Slack cloudhook does? We’d like to route the email to Gitlab to create an Issue and the current email format is rendered very poorly on Gitlab.
  • m

    matta

    03/25/2021, 12:28 AM
    Is there a way to have version-controlled READMEs for a flow? Like these: https://docs.prefect.io/orchestration/ui/flow.html#read-me
    n
    • 2
    • 1
  • u

    김응진

    03/25/2021, 7:25 AM
    Hi. My local agent doesn't seem to work. It's just stuck at the "Waiting for flow runs..." state. Are there any issues I should look into?
    a
    b
    • 3
    • 4
  • s

    Shin'ichiro Suzuki

    03/25/2021, 8:07 AM
    Hello. I've recently started using Prefect, and it feels great so far. There's not much information in Japanese, so I'm hoping to cultivate that and help activate the Japanese community as well.
  • v

    Varun Joshi

    03/25/2021, 8:33 AM
    Hi Prefecters, I'm running into an issue. The same code ran fine on a different VM with a different local agent. When I switched to a new VM with the same configuration and another local agent, I came across
    Failed to load and execute Flow's environment: AttributeError("'str' object has no attribute 'keys'")
    Any inputs will be much appreciated.
    • 1
    • 1
  • d

    Dave Hirschfeld

    03/25/2021, 8:51 AM
    I'm trying to understand the communication patterns between the different prefect server components. IIUC the UI component has to talk to the apollo graphql frontend via a publicly accessible IP. Do any of the other components need to talk to that endpoint via the public IP?
    a
    • 2
    • 2
a

Amanda Wee

03/25/2021, 8:59 AM
Nope. The agents do though.
👍 1
d

Dave Hirschfeld

03/25/2021, 9:12 AM
Great - thanks for the info Amanda!