• Daniel Davee

    1 year ago
    How can I get the reason why a flow failed? All I get is that it didn't work.
    3 replies
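
    A minimal sketch of one way to dig out the failure reason when running Prefect 1.x locally (the flow and task here are hypothetical): flow.run() returns a State whose result map holds each task's final state and message.

    from prefect import task, Flow

    @task
    def might_fail():
        raise ValueError("something broke")

    with Flow("debug-example") as flow:
        might_fail()

    state = flow.run()
    # state.result maps each task to its final State; the message carries the exception text
    for t, task_state in state.result.items():
        if task_state.is_failed():
            print(t.name, "->", task_state.message)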
  • Diego Alonso Roque Montoya

    1 year ago
    Is there a Warning status in flows? Our use case is that we want to mark something as successful for the time being, but flag it as needing review when an engineer is available.
    3 replies
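
    Prefect 1.x has no built-in Warning state; a common workaround (a sketch with a hypothetical validate task) is to finish with the SUCCESS signal carrying a review marker in its message, which stays visible on the task run:

    from prefect import task
    from prefect.engine import signals

    @task
    def validate(record):
        if record.get("suspicious"):
            # The task still finishes green, but the message flags it for later review
            raise signals.SUCCESS(message="needs-review: suspicious record")
        return record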
  • Arkady K.

    1 year ago
    Hi all! Are there any docs on how to create a tenant in a k8s cluster? Our cluster got rebooted over the weekend and we lost our default tenant, and the teammate who set it up is on vacation, so we need help creating a default tenant.
    10 replies
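
    For reference, the Server CLI exposes prefect server create-tenant (it appears later in this thread list); the same thing can be scripted through the Client, as in this sketch (the host is a placeholder, and create_tenant is assumed to be available against a Server backend):

    from prefect import Client

    # Point the client at the Server's GraphQL endpoint if it is not on localhost
    client = Client(api_server="http://<server-host>:4200")
    client.create_tenant(name="default", slug="default")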
  • nick vazquez

    1 year ago
    Hi! I'm new to Prefect, and I'm trying to use my on-prem Dask cluster but running into issues: the workers were trying to submit logs to
    localhost:4200
    even though the Prefect server/agent were running on a dedicated scheduler box. Am I missing something in configuring the workers? Do they need to point back at the scheduler's IP for logging? Do I need to run an agent on each machine to pass jobs to the workers?
    15 replies
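
    A sketch of the usual fix, assuming Prefect 1.x's config layout: the endpoint the workers log to comes from Prefect's own config in each worker's environment, so point it at the scheduler box before Prefect is imported in every dask-worker process (the IP is a placeholder):

    import os

    # Prefect reads its config at import time, so these must be set in the
    # environment of each dask-worker process, not just on the scheduler box
    os.environ["PREFECT__SERVER__HOST"] = "http://10.0.0.5"  # scheduler box, placeholder
    os.environ["PREFECT__SERVER__PORT"] = "4200"

    import prefect
    # With the server backend selected, this should now resolve to the scheduler box
    print(prefect.config.cloud.api)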
  • Lukáš Polák

    1 year ago
    Hi everyone! We would like to run our Prefect Server in AWS infrastructure. Right now we are debating whether to use a normal AWS RDS instance for the DB or go with Aurora Serverless. Does anybody have experience running Prefect on either type of DB? I'm mostly keen to know whether you've run into problems with the max-connections limit.
    1 reply
  • Tom Forbes

    1 year ago
    The docs on output caching say that cached states are stored in memory when running Prefect Core locally. This seems like a strange limitation: it's quite handy during debugging to just call
    flow.run()
    , but not having any caching is annoying. Is there a way to work around this to enable caching for locally run tasks?
    1 reply
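
    One workaround (a sketch, not the documented caching mechanism): file-based targets with a LocalResult persist across flow.run() calls, which gives cache-like behaviour locally. Checkpointing is off by default for local runs, so it has to be switched on first; the directory and target template here are illustrative.

    import os
    os.environ["PREFECT__FLOWS__CHECKPOINTING"] = "true"  # must be set before importing prefect

    from prefect import task, Flow
    from prefect.engine.results import LocalResult

    @task(
        result=LocalResult(dir="/tmp/prefect-cache"),
        target="{task_name}.pkl",
        checkpoint=True,
    )
    def expensive():
        return 42

    with Flow("local-cache-example") as flow:
        expensive()

    flow.run()  # computes and writes /tmp/prefect-cache/expensive.pkl
    flow.run()  # finds the target and returns a Cached state instead of recomputing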
  • Tom Forbes

    1 year ago
    And I'm slightly confused about how Results are expected to interact with external libraries like Dask. I'd like to save my Dask dataframe to Parquet somewhere, depending on which Result is configured. Each task has a unique storage location, which can be a local directory or an S3 prefix, so how would I get this inside my task? I'd like to do:
    @task()
    def save_results(dataframe):
        dataframe.to_parquet(UNIQUE_TASK_LOCATION)
        return UNIQUE_TASK_LOCATION
    or somesuch. But Results seem to be centred around handling the writing/pickling of data for you? Ideally I wouldn't have to care whether it's an S3 prefix (for production) or a local directory (for debugging).
    42 replies
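
    One way to sidestep the Result machinery entirely (a sketch; the base_prefix parameter and path scheme are assumptions): build the location inside the task from a configurable prefix plus identifiers in prefect.context, and let dask write to it directly. to_parquet accepts a local directory or an s3:// URL (with s3fs installed) the same way.

    import pandas as pd
    import dask.dataframe as dd
    import prefect
    from prefect import task, Flow, Parameter

    @task
    def make_df():
        return dd.from_pandas(pd.DataFrame({"a": [1, 2, 3]}), npartitions=1)

    @task
    def save_results(dataframe, base_prefix):
        ctx = prefect.context
        # flow_run_id is only populated for backend runs, hence the fallback
        location = f"{base_prefix}/{ctx.get('flow_run_id', 'local')}/{ctx.get('task_name', 'save_results')}"
        dataframe.to_parquet(location)
        return location

    with Flow("save-example") as flow:
        base = Parameter("base_prefix", default="/tmp/results")  # or "s3://bucket/prefix"
        save_results(make_df(), base)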
  • Stéphan Taljaard

    1 year ago
    Hi. With a brand-new install, the order of operations is:
    1. prefect backend server
    2. prefect server start
    If I don't want "default" displayed in the UI (i.e. I just need one tenant, but don't want it named "default"), I need to add a new tenant. This is done through:
    3. prefect server create-tenant --name "Some Other Name"
    This creates a new, additional tenant, but I only need one... Ideally the steps from above would be 1, then 3, then 2. However, that order doesn't work because the server (database) has to be up for command 3 to work. Would it be useful to have the default tenant name as an optional argument to
    prefect server start
    ? Or am I missing something w.r.t. the creation of my tenant?
    6 replies
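
    Step 3 can also be scripted so it runs itself once step 2 is up, as in this sketch (the polling URL and tenant name are placeholders; create_tenant is assumed to be available against a Server backend):

    import time

    import requests
    from prefect import Client

    # Poll the server's endpoint until it accepts connections, then create
    # the single tenant we actually want
    while True:
        try:
            requests.get("http://localhost:4200", timeout=2)
            break
        except requests.ConnectionError:
            time.sleep(2)

    Client().create_tenant(name="Some Other Name")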
  • Garret Cook

    1 year ago
    I have a flow scheduled to run every 15 minutes. Occasionally a run is delayed, and in that case I'd like Prefect to skip the missed runs of this particular flow (rather than running them all to catch up). Is there a built-in way to accomplish this?
    7 replies
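
    Not built into the schedule itself, but one workaround (a sketch; the threshold and task name are arbitrary) is to have the first task compare the run's scheduled start time with the clock and raise SKIP, which propagates to downstream tasks by default:

    import pendulum
    import prefect
    from prefect import task
    from prefect.engine import signals

    @task
    def skip_if_stale(max_delay_minutes: int = 10):
        scheduled = prefect.context.get("scheduled_start_time")
        if scheduled and pendulum.now("UTC") > scheduled.add(minutes=max_delay_minutes):
            # Skips propagate, so the rest of the flow is skipped too
            raise signals.SKIP(message="Run started too late; skipping this interval")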
  • Chohang Ng

    1 year ago
    import pandas as pd
    import db as db
    from prefect import Task, Flow
    from prefect.run_configs import LocalRun
    from prefect.executors import LocalDaskExecutor

    # When subclassing Task, the work belongs in run(), which Prefect calls at
    # flow runtime. Doing it in __init__ would execute the query at build time.
    class ETL(Task):
        def extract(self):
            query = """SELECT b.oproduct_id, p.oproduct_id, p.oproduct_parent_id, b.obundle_parent_id
    FROM hq.oproducts p
    JOIN hq.obundles b ON b.oproduct_id = p.oproduct_id
    WHERE b.oproduct_id = 5801"""
            return pd.read_sql(query, db.read_conn)

        def run(self):
            df = self.extract()
            df.to_csv(r"C:\Users\cho.ng\test\df.csv", index=False)

    etl = ETL()

    with Flow('flow_3', executor=LocalDaskExecutor(), run_config=LocalRun()) as flow:
        etl()  # calling the instance inside the flow context adds it as a task

    flow.register(project_name="tester")
    18 replies