prefect-community
  • b

    Bartek

    05/13/2020, 12:29 PM
    Hi, I have noticed that after restarting Prefect Server I lose history and flows, and I have to register all my flows again. Is it possible to preserve history and flows after stopping the server?
    n
    • 2
    • 5
  • s

    Sandeep Aggarwal

    05/13/2020, 12:44 PM
    Hi All, Prefect newbie here. I am currently evaluating Prefect for a switch over from Airflow. A particular use case I am struggling with is accessing the result of upstream tasks in the state handler of the current task. In Airflow, I could achieve the same by querying the XCom for the upstream task instance. Also, is there a way to access a task using its ID or name inside a state handler? Any help would be really appreciated. Thanks.
    👀 2
    l
    • 2
    • 9
  • b

    Bartek

    05/13/2020, 1:11 PM
    Hi, I am experiencing an issue with the `UI`: it is not showing any `flows` and `runs`. I register the flow successfully and see in the agent logs and server logs that the flow runs when it is scheduled, but I have no information about flows and runs in the `UI`.
    a
    n
    • 3
    • 26
  • t

    Troy Sankey

    05/13/2020, 2:27 PM
    My flows use KubernetesJobEnvironment and I specify a custom job_spec, but I'm noticing that I need to manually delete the k8s job between flow runs or else subsequent runs will fail to create the k8s job due to the job already existing.
    n
    j
    • 3
    • 16
  • k

    Kostas Chalikias

    05/13/2020, 2:38 PM
    Hi there, I'm trying to understand how the zombie killer decides to mark a task as failed and by extension how the heartbeats are actually sent. We use the local daemon with cloud which I believe forks a process per flow, who is doing the heartbeating there?
    👀 1
    n
    z
    • 3
    • 7
  • m

    Matthias

    05/13/2020, 3:02 PM
    Hi, the following code crashes on me in Dask, due to `Large object of size 5.49 MB detected in task graph:` Am I doing something wrong? This is the simplest example I could come up with that shows this behaviour.
    d
    j
    • 3
    • 26
  • s

    Scott Zelenka

    05/13/2020, 3:03 PM
    Looking for best practices around continuous delivery for Flows. Specifically, we have a Flow that's triggered from another system over GraphQL. We're currently triggering the Flow on the `version_group_id`. The execution of the Flow includes a step to write data back to a Prod API of an external system. The challenge comes when making iterations to this Flow in development. The external system's Dev instance still triggers our Flow through GraphQL, but is expecting it to write data back to a Dev API. We could update and deploy the Flow to Cloud, but then the `version_group_id` picks up our Dev Flow, rather than our Prod Flow. The only thing I can think of is to have two different Flows deployed on Cloud (one for Prod and another for Dev). But in that case, promoting from Dev to Prod has a bunch of manual steps prone to human error. Interested in the community's thoughts on how you handle deploying multiple versions of the same Flow between Dev and Prod environments, where the configuration differs between Dev and Prod.
    👀 1
    j
    n
    t
    • 4
    • 6
  • d

    Darragh

    05/13/2020, 3:06 PM
    Has anyone managed to get Prefect installed into a Docker container for the purpose of building Flows into Docker? I keep running around in circles with the D-in-D problem, my brain is creaking…
    s
    t
    • 3
    • 69
  • j

    John Ramirez

    05/13/2020, 3:12 PM
    hey everyone - weird question, but does anyone have experience using Apache Spark? I’m investigating, for a project, best practices to run multiple parameter models on a single data set within a Spark cluster orchestrated with prefect. My main question is where to place the multiplier; would I get better performance by submitting multiple jobs using `.map()`, or by submitting a single job and managing the different models within the single spark job?
    n
    • 2
    • 1
  • e

    Emmanuel Klinger

    05/13/2020, 3:41 PM
    Hi. I'm wondering what is the recommended way to run only parts of a flow. For example, only tasks with certain tags.
    z
    n
    • 3
    • 6
  • k

    Kaz

    05/13/2020, 4:13 PM
    Hey all, has anyone experienced issues where every so often the flow doesn’t execute because it fails to load custom modules? I’m using a local agent and have added the correct import paths. My flow was working just fine for the first few runs. Now, it runs maybe 1/3 times because it fails to load one of my custom modules. This could be a supervisord issue as well, but I’ve been doing some digging around and I’m unable to pinpoint where things are going wrong. any and all help is appreciated!
    j
    • 2
    • 1
  • a

    Andy Waugh

    05/13/2020, 6:59 PM
    Hello 👋 can anyone help clarify my understanding of how LOOPing and Context work between workers, especially (if it makes a difference) when using a distributed Dask cluster? Presumably the new task that is created is not necessarily executed on the same worker? And if not, can I safely rely on the newly LOOP’d task having the latest context? Here I’m specifically interested to ensure task_loop_result will reliably have the latest value, but I am also generally interested to better understand how Context etc. is managed between workers. Hope that makes sense - feel free to point me at anything if this is already explained elsewhere! Thanks! Andy
    👋 1
    c
    • 2
    • 3
  • j

    Jeremiah

    05/13/2020, 9:20 PM
    If you’re curious about Vue or front-end development, @nicholas is going to give @Laura Lorenz (she/her) an introduction to Vue in this Friday’s live stream — their goal will be to add a new tile to the open-source UI from scratch, resulting in a new PR for Prefect Server! All experience levels welcome, you can sign up here: https://www.meetup.com/Prefect-Community/events/270547519
    🙏 3
    🚁 1
    🚀 4
    👏 7
    🎨 1
    m
    • 2
    • 2
  • m

    matta

    05/13/2020, 11:20 PM
    What's the best way to pass an object containing credentials? I'm making an ETL to get stuff out of Google Sheets, and I'm using the `gspread` package, which has you do everything from method calls to an authentication object. So like you go `gc = gspread.service_account(filename=<filename>)` and point it at a special credentials file, then everything is through that. Should I just pass it to Secrets? Is there any risk of sensitive credentials being cached somewhere if I define the `gc` object within the Flow itself?
    n
    j
    l
    • 4
    • 37
  • d

    Darragh

    05/14/2020, 10:20 AM
    Has anyone come across a case where you build a flow locally, and the register step is configured to register to a remote server (private prefect instance on aws, not prefect cloud)? The output gives me a url where the flow should be registered on that server, but there's nothing in the UI.
    👀 1
    d
    s
    n
    • 4
    • 92
  • s

    Simon Basin

    05/14/2020, 2:46 PM
    Hello! - Task question: is it possible to 1. re-run a task w/o triggering downstream dependencies? 2. do the same with ad-hoc task parameters?
    👀 1
    l
    • 2
    • 5
  • c

    Christopher Harris

    05/14/2020, 4:26 PM
    Does prefect support micro-batching? In more detail: We’re trying to migrate our existing generator pattern to prefect - and we’re hoping we can change to a micro-batching model. Basically, our first node in our pipeline is responsible for pulling data from a location and pushing it out to the rest of the pipeline (a DAG). We were hoping to use the LOOP construct to have that “source node” pull data in `batch_size` increments, and map the individual data packets across the remaining DAG. In a way this kind of seems like a “workflow loop” with the parameters for the first node constantly updating.
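The batching pattern described in this question can be sketched in plain Python, independent of Prefect. This is a minimal illustration only: `micro_batches`, `process`, and `run_pipeline` are hypothetical names, `process` stands in for the downstream DAG, and nothing here uses or implies Prefect's LOOP or mapping API.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(source: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size items from source."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(source)
    while True:
        # islice consumes up to batch_size items without loading the rest
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process(packet: int) -> int:
    # Hypothetical stand-in for the downstream DAG applied to one packet
    return packet * 10


def run_pipeline(source: Iterable[int], batch_size: int) -> List[int]:
    """Pull data in batches and apply process to every packet in each batch."""
    results: List[int] = []
    for batch in micro_batches(source, batch_size):
        results.extend(process(packet) for packet in batch)
    return results
```

Because `micro_batches` is a generator, the source is only read as far as the current batch, which is the property the "source node" in the question relies on.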
    👀 1
    d
    • 2
    • 9
  • j

    Julia Eskew

    05/14/2020, 5:46 PM
    Hi! Is there a regular release schedule for Prefect? I'd like to install a Prefect version that has the snowflake connector bump that's here: https://github.com/PrefectHQ/prefect/blob/master/setup.py#L56
    j
    • 2
    • 2
  • t

    tkanas

    05/14/2020, 7:26 PM
    For a given flow, if a task fails I want to write the state to my local machine and be able to load it back up later. I am looking into state handlers and persisting output using `LocalResult`, but I'm wondering if there are Prefect features that are particularly appropriate for this use case.
    👀 1
    d
    • 2
    • 18
  • m

    Matthias

    05/14/2020, 8:05 PM
    Hi! Is there an easy way to make a task the last or first task of a flow explicitly?
    👀 1
    d
    j
    • 3
    • 11
  • d

    Dan DiPasquo

    05/14/2020, 9:23 PM
    Trying to understand why some flows die due to Zombie Killer when other, similar flows don't. Sometimes I see tasks succeed after 5 minutes or more without being killed, while others are killed after 3 mins or so. With Zombie Killer enabled, does every @task wrapped function need to complete in < 2 minutes or risk being terminated? For example, if a task includes a subprocess.run() that does some long running work, is the heartbeat blocked until subprocess.run completes?
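As general Python background for the last part of this question, here is a small sketch of a heartbeat running in a background thread while the main thread is blocked in `subprocess.run()`. It is not Prefect's heartbeat mechanism, which this thread does not describe; `run_with_heartbeat` and `SLEEP_COMMAND` are hypothetical names used only for illustration.

```python
import subprocess
import sys
import threading
import time


def run_with_heartbeat(command, interval=0.05):
    """Run command while a background thread records a beat every interval seconds."""
    beats = []
    stop = threading.Event()

    def heartbeat():
        # Event.wait returns False on timeout and True once stop is set
        while not stop.wait(interval):
            beats.append(time.monotonic())

    worker = threading.Thread(target=heartbeat, daemon=True)
    worker.start()
    try:
        # Blocks the calling thread until the child process exits
        completed = subprocess.run(command, check=True)
    finally:
        stop.set()
        worker.join()
    return completed.returncode, len(beats)


# A child process that simply sleeps, standing in for long-running work
SLEEP_COMMAND = [sys.executable, "-c", "import time; time.sleep(0.5)"]
```

Calling `run_with_heartbeat(SLEEP_COMMAND)` returns the child's exit code and the number of beats recorded while it ran. In this sketch the beats keep arriving because the blocking wait happens in a different thread from the heartbeat; whether that holds for a given Prefect setup depends on how its heartbeat is actually implemented.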
    👀 1
    d
    c
    • 3
    • 29
  • m

    Matthew Maldonado

    05/14/2020, 10:02 PM
    I'm on Windows and prefect usually starts on restart just fine. However, I accidentally ran prefect server start again and now no flows show up in the UI. I even tried registering them again. Should I reinstall prefect? Flows look like they might be registering in the CLI, but they are not showing up in the UI. Prefect cloud seems to work fine though.
    l
    • 2
    • 1
  • m

    Matthew Maldonado

    05/14/2020, 11:27 PM
    So I registered my flows in prefect cloud. I tried to run one on demand. However, it is not running. It says running late. Am I missing a setup step here?
    c
    • 2
    • 8
  • j

    Joe Schmid

    05/15/2020, 3:05 AM
    Question on testing and referring to task results. This example is contrived but maybe useful for discussion. Simplified class that defines a flow:
    @task
    def times_two(x):
        return x * 2
    
    @task
    def add(items):
        return sum(items)
    
    class SimpleFlow(SRMPrefectFlow):
        @property
        def flow(self) -> Flow:
            with Flow("SimpleFlow", environment=env) as flow:
                x = Parameter("x", default=[1, 2, 3])
    
                times_two_task_result = times_two.map(x)
                flow_result = add(times_two_task_result)
            return flow
    And a simple test to run the flow & check the last task's result:
    def test_flow_run_result():
        flow = SimpleFlow().flow
        fr = flow.run()
        assert list(fr.result.values())[2].result == 12
    The `list(fr.result.values())[2].result` works, but is fragile. We'd rather `fr.result[flow_result].result` but `flow_result` isn't available outside of the function that defines the flow. Is there a better approach that people have used?
    j
    i
    • 3
    • 13
  • b

    Barry Roszak

    05/15/2020, 8:07 AM
    Hi, is it possible to have dynamic `tags`? If I use map in one task and I want to limit operations based on the input to the task?
    j
    • 2
    • 5
  • c

    Cab Maddux

    05/15/2020, 2:03 PM
    Hi, we're seeing some issues with Prefect cloud, none of our registered flows are available in the cloud UI as of about 30 minutes ago
    z
    j
    +3
    • 6
    • 15
  • p

    Pierre CORBEL

    05/15/2020, 3:54 PM
    Hello, just to let you know that since `prefect 0.11.0`, you can't use the environment variable `PREFECT__CLOUD__AGENT__LABELS=["hello"]` anymore; you have to wrap the value in single quotes, like `PREFECT__CLOUD__AGENT__LABELS='["hello"]'`. I know it is documented with the right format in the docs, but it was working with a "bad" format before, so it can break your flow after upgrading to v0.11.0 👍
    :upvote: 4
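One way to see why the quoting matters is to simulate what a POSIX shell does to each form and then try to read the result as a list. The sketch below rests on two assumptions that the message does not confirm: that the variable is set from a POSIX shell, and that the value is parsed as a Python literal. `value_after_shell` and `parse_labels` are hypothetical helpers, not Prefect functions.

```python
import ast
import shlex


def value_after_shell(assignment: str) -> str:
    """Return the value part of NAME=value after POSIX shell quote removal."""
    (word,) = shlex.split(assignment)
    _, _, value = word.partition("=")
    return value


def parse_labels(raw: str):
    """Parse raw as a Python literal; return None if it is not one."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return None


# Without single quotes the shell strips the double quotes around hello
unquoted = value_after_shell('PREFECT__CLOUD__AGENT__LABELS=["hello"]')

# With single quotes the inner double quotes survive
quoted = value_after_shell("PREFECT__CLOUD__AGENT__LABELS='[\"hello\"]'")
```

Here `unquoted` is the string `[hello]`, which is not a valid list literal, while `quoted` is `["hello"]`, which parses to a one-element list.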
  • w

    Will Milner

    05/15/2020, 4:47 PM
    I'm getting a weird error from the apollo service whenever I try and run a flow. The error I see in the logs is this
    apollo_1     | 2020-05-15T16:42:41.530Z {"message":"Cannot query field \"setFlowRunStates\" on type \"Mutation\". Did you mean \"set_flow_run_states\" or \"set_task_run_states\"?","locations":[{"line":2,"column":5}],"extensions":{"code":"GRAPHQL_VALIDATION_FAILED"}}
    
    apollo_1     | 2020-05-15T16:42:41.547Z {"message":"Unknown type \"writeRunLogsInput\". Did you mean \"write_run_logs_input\", \"write_run_log_input\", \"create_flow_input\", or \"archive_flow_input\"?","locations":[{"line":1,"column":18}],"extensions":{"code":"GRAPHQL_VALIDATION_FAILED"}}
    Not sure how I go about debugging this, any tips?
    t
    j
    • 3
    • 13
  • l

    Laura Lorenz (she/her)

    05/15/2020, 7:42 PM
    Hi all! @nicholas and I are queued up for a livestream today on how to contribute to Prefect Core’s open source UI in 15 minutes! Feel free to drop by, or go to this link after for the recording if you can’t make it!

    https://www.youtube.com/watch?v=YHqfJwFvTFY

    :marvin: 4
    🚀 4
    😍 3
    d
    n
    • 3
    • 9
  • b

    Brad

    05/16/2020, 12:41 AM
    Hi team, I’m just playing around with the new result class; it looks really nice so far! Would it be possible to pass the inputs to the read/write functions also? I’d like to parameterise my filenames/targets per input. And another question: according to the docs these results are persisted to the prefect database; should I expect to see this table in the GraphQL API?
    c
    • 2
    • 15
c

Chris White

05/16/2020, 12:50 AM
Hey Brad! All great questions; there isn’t a first-class way to parametrize by inputs but I think that’s a reasonable request — if you open an issue for it we can take a look at how to support that! The biggest caveat will be that some input string representations won’t work well with filename templating but we can put some reasonable constraints on that
To extract the result locations from the GraphQL API right now is a little messy, but if you query for the `serialized_state` attribute of a task run you should be able to find the info you’re looking for. We’re going to make that more convenient in the very near future (along with better UI representations of results)
🎊 1
And actually you just reminded me I should announce the 0.11 release in #announcements !
b

Brad

05/16/2020, 12:55 AM
Re point 1 - Yep agreed not every input will have a good str repr, but if it does the user has the choice to include it.
obviously if you try and template a random object

https://www.autodesk.com/products/fusion-360/blog/wp-content/uploads/2016/09/youre-gonna-to-have-a-bad-time.jpg

😂 1
What I’ve actually been doing is taking the module path + task name + hash of the cloud-pickled inputs to make a unique but deterministic filename
which works quite well
(if you have for example, multiple flows/users using common tasks)
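The naming scheme described above (module path + task name + hash of the pickled inputs) might look roughly like the following. This is a sketch under stated assumptions, not Brad's actual code: it uses the standard-library `pickle` in place of cloudpickle, and `result_filename`, the separator characters, and the `.pkl` suffix are all hypothetical choices.

```python
import hashlib
import pickle


def result_filename(module_path: str, task_name: str, inputs: dict) -> str:
    """Build a deterministic filename from a task's identity and its inputs."""
    # Sort by argument name so keyword order does not change the hash
    payload = pickle.dumps(sorted(inputs.items()), protocol=4)
    # Shorten the digest to keep filenames readable (assumed length)
    digest = hashlib.sha256(payload).hexdigest()[:16]
    return f"{module_path}.{task_name}-{digest}.pkl"
```

The same task called with the same inputs maps to the same filename, so several flows or users sharing a common task would resolve to one target. Pickle output for arbitrary objects is not guaranteed to be stable across Python versions, so a scheme like this is safest with simple, plainly serializable inputs.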
c

Chris White

05/16/2020, 1:00 AM
i’m impressed by how quickly you’re exercising that API! I love it
b

Brad

05/16/2020, 1:00 AM
but this requires the new Result class to pass the inputs through
c

Chris White

05/16/2020, 1:00 AM
yea
yea definitely open an issue; it can be our first enhancement on the API 😄
b

Brad

05/16/2020, 1:01 AM
doing it as we speak
💯 1
https://github.com/PrefectHQ/prefect/issues/2577
c

Chris White

05/16/2020, 1:09 AM
awesome i appreciate it