Powered by Linen
show-us-what-you-got
  • e

    Emma Rizzi

    08/26/2021, 8:52 AM
    Hello! A few months ago I had issues deploying Prefect Server with ECS and a private Docker storage. As it's all working well now, I wrote a tutorial on the steps I had to follow, hoping it might help anyone who needs this particular configuration too 🙂 Here's the article: https://towardsdatascience.com/deploying-prefect-server-with-aws-ecs-fargate-and-docker-storage-36f633226c5f
    ❤️ 3
    🎉 9
    👋 7
  • h

    Henning Holgersen

    08/28/2021, 1:34 PM
    So… this is pretty much the opposite of “what I got”, but: I’m lucky enough to start from zero with prefect and I know it is easy to forget what we found to be confusing, so in my “prefect hello world” repo I made sure the readme included my thoughts as I created an account and went along with the getting started resources. Just in case anyone finds it useful. It is an honest attempt to record my first impression: https://github.com/radbrt/prefect_pipeline. I’m happy to add to the documentation once I get a little experience.
    :upvote: 6
  • d

    davzucky

    09/17/2021, 12:56 AM
    Just did a presentation at my company about Prefect. All the slides (as a Jupyter notebook) and code are available on my GitHub. Please enjoy, and feel free to use them if you have a presentation to do. https://github.com/davzucky/prefect_presentation
    :upvote: 12
    ❤️ 9
  • r

    Richard Pelgrim

    09/24/2021, 2:14 PM
    Just published a blog about connecting Prefect to a Dask cluster using Coiled for processing larger-than-memory Tasks and/or Flows. https://towardsdatascience.com/scaling-your-prefect-workflow-to-the-cloud-2dec4e0b213b
    🚀 15
    👍 7
    :marvin: 8
  • a

    Anna Geller (old account)

    09/29/2021, 7:10 PM
    Here is an updated AWS ECS Fargate walkthrough: https://towardsdatascience.com/how-to-cut-your-aws-ecs-costs-with-fargate-spot-and-prefect-1a1ba5d2e2df
    It can be useful if:
    • you face some issues configuring your Prefect ECS agent (especially permissions),
    • you want to use spot instances to save costs,
    • you want to automate agent deployment for different environments (staging, dev, prod).
    👍 11
    🎉 11
    :party-parrot: 9
  • a

    Aldo Escobar

    11/04/2021, 8:55 PM
    Hello! I want to share with you a blog post about SoaM, a Prefect-based library created by Mutt Data. We use this forecasting framework a lot because we built its code base on the experience of our previous projects. https://www.linkedin.com/posts/mutt-data_soam-public-release-activity-6843922592252932096-kM8u
    :prefect: 2
    🚀 8
    💪 1
    :marvin: 5
  • k

    Khuyen Tran

    11/09/2021, 5:30 PM
    Hi everybody, I have just written an article on how to orchestrate a data science project with Prefect: https://towardsdatascience.com/orchestrate-a-data-science-project-in-python-with-prefect-e69c61a49074#67c1-8f85fb1cfe73
    🚀 4
    :marvin: 16
    👍 6
    👏 2
    🙌 3
    :upvote: 14
  • p

    psimakis

    11/13/2021, 6:02 PM
    Hey everyone! I implemented this Sphinx plugin that aims to simplify the process of inserting prefect flow visualizations into a Sphinx project. If you are using both Prefect and Sphinx, feel free to give it a go. Any feedback is more than welcome. There is also an article/demo about this plugin. :prefect: + Sphinx = ❤️
    🚀 5
    :marvin: 7
  • a

    Anna Geller

    11/18/2021, 4:01 PM
    If you want to orchestrate several flows with Prefect :prefect:, check out our recent blog post covering the entire ELT lifecycle, from building independent ingestion workflows through data transformations with dbt :dbt: up to triggering downstream flows that use data for reporting and analytics: https://medium.com/the-prefect-blog/orchestrating-elt-with-prefect-and-dbt-a-flow-of-flows-part-1-aac77126473
    🙌 3
    🚀 6
    :marvin: 9
    💯 2
    :dbt: 8
    :upvote: 14
  • c

    Chris Arderne

    11/23/2021, 11:15 AM
    I struggled a bit coming to terms with Prefect's terminology and how cloud/agents/flows etc. integrate, but got some great help from the champions on this forum. Now that things are humming along I decided to write up what I figured out (focused on the infrastructure side of things, rather than writing Flows). Hopefully it's mostly correct and hopefully it's useful for someone! https://rdrn.me/scaling-out-prefect/
    🙌 17
    🚀 14
    💯 11
  • j

    Jarek Lukow

    11/26/2021, 2:01 PM
    We might be a little bit crazy, but we’re doing some business process management using Prefect. Here’s a 20 min presentation about how we integrated Prefect with our Developer Portal at Box (much more talking about the use case than Prefect itself, but still I think it’s worth sharing 🙂):

    https://www.youtube.com/watch?v=apCDT3_DmFk&t=438s

    💯 9
    🚀 9
  • c

    Constantino Schillebeeckx

    12/06/2021, 7:26 PM
    Hi everyone, I put together a sphinx plugin to make documenting your tasks much easier - could be interesting to use in conjunction with this too.
    🔥 8
    :marvin: 4
  • m

    Mathijs Miermans

    12/13/2021, 10:57 PM
    Edit: Wrong channel, sorry! This should have gone in #prefect-community. I'm trying to use the stepactivate task to execute an AWS step function, and it's failing on the second flow run because execution_name is not unique. We're trying to set execution_name to a uuid4 at run time, but that is probably happening at registration time. How can task arguments be computed dynamically at run time?
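A plain-Python sketch of the likely cause described in the question above. It does not use Prefect or the step function task, and `build_eager` and `build_deferred` are hypothetical names chosen for illustration. A value computed while the pipeline is being defined is frozen into it, while a value computed inside the step itself is regenerated on every run.

```python
import uuid


def build_eager():
    # The UUID is generated once, while the pipeline is being *defined*
    # (analogous to registration time), and every later run reuses it.
    name = str(uuid.uuid4())
    return lambda: name


def build_deferred():
    # The UUID is generated inside the step, so each *run* gets a fresh one.
    return lambda: str(uuid.uuid4())


eager_run = build_eager()
deferred_run = build_deferred()

eager_names = [eager_run(), eager_run()]
deferred_names = [deferred_run(), deferred_run()]

print(eager_names[0] == eager_names[1])        # same name on both runs
print(deferred_names[0] == deferred_names[1])  # different name on each run
```

Under that reading, one option would be to generate the UUID inside an upstream task and pass its result as execution_name, assuming the task accepts that argument at run time.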
  • a

    Ari Bajo

    12/14/2021, 10:26 PM
    For those who couldn't attend the last Airbyte Community Call, here is a tutorial sharing how to integrate Prefect, Airbyte and dbt by @alex 🙂 https://airbyte.io/recipes/elt-pipeline-prefect-airbyte-dbt
    :upvote: 9
    🚀 5
    :dbt: 5
    ❤️ 5
  • a

    Alyssa Mazzina

    12/21/2021, 7:24 PM
    Hey everybody - issue #2 of our newsletter (Prefect’s Guide to the Galaxy) just dropped, and you can read it here. Shout out to @Sylvain Hazard for being the featured Slack community post. If you want to sign up you can do so at prefect.io/#newsletter! Happy holiday engineering!
    :marvin: 7
    :upvote: 5
    👍 8
  • a

    Aram Panasenco

    12/29/2021, 9:24 PM
    Automatically uploading a flow's Python docs to the flow's README in the cloud UI
    Here's a Python script I wrote to both register a flow in Prefect and also upload its pdoc3-generated documentation to the flow's README.
    Usage:
    python register.py -p my_project -m module.that.contains.flow.variable
    Source of register.py:
    from types import ModuleType
    from typing import Union
    
    import click
    import pdoc
    from prefect import Client
    from prefect.utilities.graphql import with_args
    
    
    @click.command()
    @click.option(
        "--project",
        "-p",
        help="The name of the Prefect project to register this flow in. Required.",
        required=True,
    )
    @click.option(
        "--module",
        "-m",
        help="A python module name containing the flow to register. Required.",
        required=True,
    )
    def register_flow(project: str, module: Union[ModuleType, str], **kwargs):
        # Import the module through pdoc and grab its module-level `flow` variable.
        pdoc3_module = pdoc.Module(module=module)
        flow = pdoc3_module.obj.flow
        # Register the flow and keep the ID that register() returns.
        flow_id = flow.register(project_name=project, **kwargs)
        client = Client()
        # The README is stored on the flow group, so look up the group ID first.
        flow_group_id = client.graphql(
            {
                "query": {
                    with_args(
                        "flow",
                        {
                            "where": {
                                "id": {
                                    "_eq": flow_id,
                                },
                            },
                        }
                    ): {
                        "flow_group_id"
                    }
                }
            }
        )["data"]["flow"][0]["flow_group_id"]
        # Re-load the module, documenting only objects named after the flow's tasks.
        task_names = {t.name for t in flow.tasks}
        pdoc3_module = pdoc.Module(
            module=module, docfilter=lambda doc: doc.name in task_names
        )
        module_markdown = pdoc3_module.text()
        # Do some post-processing to keep Prefect engine from misinterpreting
        # the markdown.
        module_markdown = (
            module_markdown.replace(":   ", ":\n")
            .replace("\n:", ":")
            .replace("\n    ", "\n")
        )
        ret = client.graphql(
            {
                "mutation": {
                    with_args(
                        "set_flow_group_description",
                        {
                            "input": {
                                "flow_group_id": flow_group_id,
                                "description": module_markdown,
                            },
                        }
                    ): {
                        "success"
                    }
                }
            }
        )
        if not ret["data"]["set_flow_group_description"]["success"]:
            raise RuntimeError("Failed to set flow group README")
    
    
    if __name__ == "__main__":
        register_flow()
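To see what the post-processing step in the script above does, here is a small standalone sketch that applies the same three replacements to a sample string. The sample input is an assumption about what a definition-list style entry in the generated text might look like, and `flatten_definition_lists` is a hypothetical helper name, not part of pdoc3 or Prefect.

```python
def flatten_definition_lists(markdown: str) -> str:
    """Apply the same three string replacements as register.py."""
    return (
        markdown.replace(":   ", ":\n")  # move the definition body to its own line
        .replace("\n:", ":")             # attach the colon to the term line
        .replace("\n    ", "\n")         # drop the four-space continuation indent
    )


# Assumed input shape: a term line followed by an indented definition.
sample = "`my_task`\n:   Loads the data.\n    Retries on failure.\n"
result = flatten_definition_lists(sample)
print(result)
```

The term ends up on one line with a trailing colon and the description lines follow it without indentation, so they are no longer at risk of being rendered as an indented code block.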
    👀 6
    ❤️ 3
    :upvote: 9
  • c

    Christoph Deil

    01/04/2022, 9:15 PM
    I’ve started a Prefect tutorial to introduce it to my colleagues tomorrow: https://github.com/cdeil/prefect-tutorial It’s only half finished or less, but I thought I’d mention it in case someone is interested or would even like to collaborate on it. Specifically, I’d be interested to know if there’s already a diagram or docs that explain what happens with FlowRunner, TaskRunner etc. when flow.run() is called with the default serial executor, or where the algorithm is that linearises the task graph to decide which task runs after which. And also how schedulers work, i.e. whether they have some polling for-loop in-process at their core or whether the scheduler runs in some other thread or process. For now I’m not planning to look into Dask, just trying to understand how the serial execution works under the hood. Does this information exist in some tutorial form? Or alternatively could you please point me to the few relevant parts in the code or tests to quickly understand how that works?
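As a generic illustration of what "linearising the task graph" means, here is a minimal topological sort (Kahn's algorithm). This is a textbook sketch, not taken from Prefect's source, and the task names and the `linearise` function are hypothetical.

```python
from collections import deque


def linearise(upstream):
    """Return one valid run order for a DAG given as {task: set of upstream tasks}."""
    tasks = set(upstream) | {u for deps in upstream.values() for u in deps}
    # Upstream tasks each task is still waiting on.
    remaining = {t: set(upstream.get(t, ())) for t in tasks}
    # Reverse edges: which tasks depend on a given task.
    downstream = {t: set() for t in tasks}
    for task, deps in remaining.items():
        for dep in deps:
            downstream[dep].add(task)

    # Start with tasks that have no upstream dependencies (sorted for determinism).
    ready = deque(sorted(t for t, deps in remaining.items() if not deps))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in sorted(downstream[task]):
            remaining[child].discard(task)
            if not remaining[child]:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("cycle detected: not a DAG")
    return order


graph = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load", "extract"},
}
order = linearise(graph)
print(order)
```

A serial run would then execute the tasks one by one in that order, so every task starts only after all of its upstream tasks have finished.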
  • a

    Alyssa Mazzina

    01/04/2022, 11:07 PM
    Hello! The latest Prefect Guide to the Galaxy newsletter is out. You can read it here if you aren’t signed up to receive it via email. And this week’s shoutout goes to @Yusuf Khan, our featured Slack community poster!
    😍 3
    :marvin: 4
    🚀 2
    :upvote: 6
  • s

    Scott Treloar

    01/05/2022, 10:25 AM
    👍
  • j

    Joshua S

    01/10/2022, 3:14 PM
    Good Morning, I wanted to share a Service we here at Softrams created for Prefect. Our customers and use cases currently require Authorization to secure the runs and access to Prefect. Please check out our open-sourced service! https://github.com/softrams/prefect-auth-proxy
    👀 1
    🙌 6
  • a

    Alyssa Mazzina

    01/18/2022, 11:29 PM
    New edition of the Prefect Guide to the Galaxy sent today! Read it here, then sign up to receive the emails! https://prefect.notion.site/Guide-to-the-Galaxy-1-18-2022-9b3de20f40bf49f18ae6976646b29668
  • a

    Alyssa Mazzina

    01/18/2022, 11:30 PM
    Shoutout to @Nash Taylor, our featured community poster!
    :marvin: 7
  • z

    Zach Schumacher

    01/19/2022, 8:02 PM
    hey all - wanted to share a lib (though currently still in alpha) I just released called pydapper. It is inspired by the NuGet lib dapper and its intent is to be a simple object mapper that sits on top of the dbapi 2.0 interface. Would love some feedback! https://pydapper.readthedocs.io/en/latest/
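For readers unfamiliar with the idea of a simple object mapper on top of the DB-API 2.0 interface, here is a minimal sketch using only the standard library. It is not pydapper's API: the `query` helper and the `Task` dataclass are hypothetical and only illustrate mapping result rows onto objects.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Task:
    id: int
    name: str


def query(conn, sql, params=(), model=dict):
    """Run a query through a DB-API 2.0 connection and map each row to `model`."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        # cursor.description holds one 7-item sequence per column; item 0 is the name.
        columns = [col[0] for col in cur.description]
        rows = [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()
    if model is dict:
        return rows
    return [model(**row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO task (id, name) VALUES (?, ?)", [(1, "extract"), (2, "load")]
)

tasks = query(conn, "SELECT id, name FROM task ORDER BY id", model=Task)
print(tasks)
```

Because the helper only relies on `cursor()`, `execute()`, `description` and `fetchall()`, the same approach should carry over to other DB-API 2.0 drivers, apart from their parameter placeholder style.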
    🚀 4
  • k

    Kirk Quinbar

    01/24/2022, 5:38 PM
    I recently created a supplemental blog post on productionizing Prefect flows with Azure and Docker. It goes along with a similar one that @Kevin Kho wrote, but gets into the details of setting up an Azure container registry and a single VM with a Prefect Agent and Docker. In our scenario, we don't need the overkill and cost of Kubernetes, as all of our tasks are just making simple API calls to orchestrate an ELT process. Anyway, I thought my blog post might be of use to others who want to set up a Prefect Agent and deploy code, including dependencies, via Docker in Azure. I had a hard time finding a complete walk-through specific to Azure, which is why I wrote this one. https://medium.com/@kquinbar/productionizing-prefect-flows-with-docker-and-azure-47d96542ff9d
    🚀 12
    💯 7
    c
    k
    j
    • 4
    • 4
  • b

    Ben Welsh

    01/29/2022, 9:38 PM
    Thanks to @Kevin Kho's pro advice, this week I figured out how to deploy a Prefect cloud agent to Google Kubernetes Engine. I wrote it up as a simple tutorial here. I'd love it if some pros gave it a look and offered any pointers.
    :marvin: 6
  • b

    Ben Welsh

    01/29/2022, 9:38 PM
    https://gist.github.com/palewire/072513a9940478370697323c0d15c6ec
  • k

    Kevin Kho

    02/07/2022, 4:54 PM
    Our friends at Coiled also released their own Discourse to discuss Dask issues (not limited to Coiled)!
    👍 1
    :dask: 5
  • k

    Kevin Kho

    02/10/2022, 8:48 PM
    message has been deleted
    :upvote: 2
    👍 2
    😍 4
  • k

    Khuyen Tran

    02/19/2022, 3:14 PM
    Here is the GitHub repository for my presentation "Jupyter Notebook to production-ready code" in the Women in Data Science workshop. The project includes Prefect, Hydra, and Weights & Biases. It is organized by steps so that you can understand why I use a particular tool. If you want to get a better understanding of the project, you can attend the workshop today for free at 11:30 am PST.
    ❤️ 1
    👀 1
    🙌 7
    👍 1
    :marvin: 3
    :upvote: 3
  • e

    Evgeniya Sukhodolskaya

    03/04/2022, 3:46 PM
    Hi, Prefect community! I am a data evangelist from toloka.ai, a crowdsourcing data labeling platform. I want to share with you our work on an integration with Prefect, which aims to help Big Data and Machine Learning engineers painlessly create data gathering & cleaning pipelines. Our engineering team created the toloka-prefect Python package to orchestrate crowdsourcing pipelines in Prefect. Now, with this integration and thanks to Prefect's failure management abilities, if you need to collect large and varied amounts of data, or validate your existing dataset, you can do it without the headache of losing control over the crowd. Let me continue in thread :) P.S. A question on my behalf: are there cases of using Prefect for creating Machine Learning pipelines?
    ✅ 2
    🚀 7
    💯 2
    :upvote: 4
In Toloka, each labeling pipeline may consist of several projects created by requesters, in which tasks of a particular nature are solved with the help of a diverse crowd from all over the world. Because the barrier to entry is low and the requester pays for the markup of each task, any failure in the pipeline leads to money loss. Hence, Prefect semantics such as caching and persisting data became key to a vast improvement and budget preservation! We gave a talk, "Launching human-in-the-loop process on Toloka using Prefect", based on the popular example of a data-labeling task and want to share it with you. We are super happy to be part of the Prefect community and look forward to deepening our collaboration :) If you have any questions or feedback regarding the integration, I will be happy to comment on them in the thread here. If you want to share your pain points, ideas, and proposals with our engineering team directly, you’re welcome to join our Toloka Global Community.
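To illustrate why caching and persisting results protects the budget, here is a minimal standard-library sketch. It uses neither Prefect's nor Toloka's API: `cached`, `label_batch` and the on-disk layout are hypothetical. The idea is that a step whose result was already persisted is not executed (and paid for) again when the pipeline is re-run after a failure.

```python
import hashlib
import json
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp())  # stand-in for persistent result storage
calls = []  # records every time the expensive step really runs


def cached(step):
    """Persist a step's JSON-serialisable result, keyed by step name and input."""
    def wrapper(payload):
        raw_key = json.dumps([step.__name__, payload], sort_keys=True)
        path = CACHE_DIR / (hashlib.sha256(raw_key.encode()).hexdigest() + ".json")
        if path.exists():
            # Result already persisted: reuse it instead of paying again.
            return json.loads(path.read_text())
        result = step(payload)
        path.write_text(json.dumps(result))
        return result
    return wrapper


@cached
def label_batch(items):
    calls.append(items)  # stands in for a paid crowdsourcing request
    return {item: "labelled" for item in items}


first = label_batch(["img1", "img2"])
second = label_batch(["img1", "img2"])  # e.g. a re-run after a downstream failure
print(len(calls))
```

Only the first call reaches the "paid" step; the second one is served from the persisted result.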
a

Anna Geller

03/04/2022, 4:17 PM
Hi @Evgeniya Sukhodolskaya, welcome to the community, great to have you with us! 👋 Thank you so much for contributing and this excellent notebook explaining how to use this integration with Prefect Cloud! 👏 I will cross-post it on Discourse and I'll make sure to recommend it to any users asking about data labeling use cases for ML. To answer your question: Prefect is a general-purpose workflow orchestration platform that supports basically all data-flow automation use cases you can think of, definitely including ML pipelines! Thanks again for sharing and have a wonderful weekend!
🙌 2
👀 1