Prefect Community #ask-community

Baker

02/22/2024, 10:17 PM

Hi all, I really like using Prefect except for one missing concept, which is the orchestration of deployments. If I want to run a DAG of flows, each on its own deployment (because I want each to have its own container and deployment configuration), there is not much to help me. I need to have a parent flow that calls run_deployment, tracks the state of each run, and pulls and passes around results, potentially with some asyncio on top to make it parallel when possible. I wish we could use the same awesome API as for Tasks. Am I completely missing a key architecture that would solve my problem, or is it just the way it is with Prefect?
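For readers following along, the hand-rolled parent pattern described above might look roughly like the sketch below. It is only an illustration: `trigger` is a hypothetical stand-in for a real deployment run (it does not reproduce the signature of Prefect's run_deployment), and the deployment names are made up.

```python
import asyncio


# Hypothetical stand-in for a deployment run, so the sketch executes without a
# Prefect server. In real code this is where run_deployment would be called.
async def trigger(deployment: str, parameters: dict) -> dict:
    await asyncio.sleep(0.01)  # pretend the remote flow run takes a moment
    return {
        "deployment": deployment,
        "state": "COMPLETED",
        "result": f"{deployment}:done",
        "parameters": parameters,
    }


def require_completed(run: dict) -> dict:
    # Track state by hand: stop the parent as soon as a child did not complete.
    if run["state"] != "COMPLETED":
        raise RuntimeError(f"{run['deployment']} ended in state {run['state']}")
    return run


async def parent() -> dict:
    first = require_completed(await trigger("extract/prod", {}))
    # The two middle steps only need `first`, so they can run concurrently.
    left, right = await asyncio.gather(
        trigger("clean/prod", {"upstream": first["result"]}),
        trigger("enrich/prod", {"upstream": first["result"]}),
    )
    require_completed(left)
    require_completed(right)
    # The last step waits for both branches and receives both results.
    return require_completed(
        await trigger("report/prod", {"upstream": [left["result"], right["result"]]})
    )


final = asyncio.run(parent())
print(final["state"], final["parameters"]["upstream"])
```

Even for four steps, the dependency structure is buried in control flow, which is the verbosity being complained about.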

👀 1

Kevin Grismore

02/22/2024, 10:23 PM

I hear ya on this one. I think one thing we could do to make this easier is provide some kind of utility that shortcuts a lot of the verbose code writing it takes to get a dag-like structure of deployments, so you can worry about defining your dependencies and parameters and little else. Does that feel like it'd suit what you're looking for?

Baker

02/22/2024, 10:36 PM

I see two directions:
1. Extend the subflow concept to make it work with deployed flows.
2. Develop a YAML representation of the DAG, à la Argo Workflows.

Kevin Grismore

02/22/2024, 10:37 PM

yeah, I threw something like this together where I wrote some in-python dictionaries to define my dag of deployments, which is then sorted through and runs deployments via concurrent tasks. the outcome is nice but the interface feels very not-prefect if that makes sense

Kevin Grismore

02/22/2024, 10:38 PM

"subflow" is a confusing proposition to someone who wants a DAG and gets an in-process thing

Baker

02/22/2024, 10:40 PM

In-process and sequential, without .submit(). I've felt deceived by the subflow concept. 🥲

Baker

02/22/2024, 10:42 PM

This absence of a DAG for deployments is a critical miss for my team. I'm pretty sure that Prefect will be rejected because of that, in the end. It's too bad, because on the other hand it is really nice to use for each flow. It's just that putting them together is a hassle.

Kevin Grismore

02/22/2024, 10:44 PM

I think I'm about to pour a nontrivial amount of energy into making this a thing

Kevin Grismore

02/22/2024, 10:46 PM

in my example, the definition looks like this:

tasks = [
    {
        "task": "Task1",
        "depends_on": [],
        "flow": "flow1",
        "deployment": "deployment1",
        "work_pool": "work_pool1",
    },
    {
        "task": "Task2",
        "depends_on": ["Task1"],
        "flow": "flow2",
        "deployment": "deployment2",
        "work_pool": "work_pool2",
    },
    {
        "task": "Task3",
        "depends_on": ["Task1"],
        "flow": "flow3",
        "deployment": "deployment3",
        "work_pool": "work_pool3",
    },
    {
        "task": "Task4",
        "depends_on": ["Task2", "Task3"],
        "flow": "flow4",
        "deployment": "deployment4",
        "work_pool": "work_pool4",
    },
]
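As an illustration of how a definition like this can be "sorted through", here is a minimal sketch using the standard library's `graphlib`. This is not the code from the thread, which was never posted; the helper name `execution_waves` is hypothetical, and the entries are trimmed to the two keys the ordering needs.

```python
from graphlib import TopologicalSorter

# Same dependency shape as the definition above, trimmed to the keys needed here.
tasks = [
    {"task": "Task1", "depends_on": []},
    {"task": "Task2", "depends_on": ["Task1"]},
    {"task": "Task3", "depends_on": ["Task1"]},
    {"task": "Task4", "depends_on": ["Task2", "Task3"]},
]


def execution_waves(tasks: list) -> list:
    """Group entries into waves; everything in one wave can run concurrently."""
    graph = {t["task"]: set(t["depends_on"]) for t in tasks}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the definition has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make output stable
        waves.append(ready)
        sorter.done(*ready)  # marking a wave done unlocks its dependents
    return waves


waves = execution_waves(tasks)
print(waves)
```

Each wave could then be handed to whatever runs the deployments concurrently, with the next wave starting once the previous one has finished.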

Kevin Grismore

02/22/2024, 10:47 PM

and running it gets you this. (it isn't actually calling run_deployment but let's pretend)

Kevin Grismore

02/22/2024, 10:48 PM

is that at all appealing for what you're looking for?

Baker

02/22/2024, 10:51 PM

It's getting there indeed. But it needs some inputs management, and ideally outputs as well.

Kevin Grismore

02/22/2024, 10:51 PM

yup, it's definitely missing that

Kevin Grismore

02/22/2024, 10:51 PM

but doable

Baker

02/22/2024, 10:51 PM

Outputs are secondary as we can declare the output location in the input, but would be nice to have.
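One way to read this suggestion, sketched under assumptions: if every step is told where to write as part of its inputs, the parameters for the whole DAG can be computed before anything runs. The step names, paths, and the `output_path` / `input_paths` keys below are all hypothetical conventions, not anything Prefect defines.

```python
# Hypothetical step definitions: "params" are static inputs and "output" is the
# location each step is told to write to. All names and paths are made up.
steps = {
    "extract": {"depends_on": [], "params": {"day": "2024-02-22"}, "output": "out/extract"},
    "clean": {"depends_on": ["extract"], "params": {}, "output": "out/clean"},
    "enrich": {"depends_on": ["extract"], "params": {}, "output": "out/enrich"},
    "report": {"depends_on": ["clean", "enrich"], "params": {}, "output": "out/report"},
}


def build_parameters(steps: dict) -> dict:
    """Build each step's parameters from its own output and its upstream outputs."""
    resolved = {}
    for name, step in steps.items():
        resolved[name] = {
            **step["params"],
            # Where this step should write its result.
            "output_path": step["output"],
            # Where its upstream steps were told to write theirs.
            "input_paths": [steps[up]["output"] for up in step["depends_on"]],
        }
    return resolved


parameters = build_parameters(steps)
print(parameters["report"])
```

Nothing has to be pulled back from a finished run with this convention, which is why outputs become secondary; the trade-off is that a step cannot return a value that was not planned for in advance.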

Baker

02/22/2024, 10:52 PM

But you have to find another name than Task, I think! 😄

Kevin Grismore

02/22/2024, 10:53 PM

haha I put this together late at night

Kevin Grismore

02/22/2024, 10:53 PM

but I'll try to get a repo with a good example of it out there within a few days. how/if it actually manifests in Prefect itself is a whole other story, but I think it's good to have in case you find it useful

👍 1

Baker

02/22/2024, 10:55 PM

To be honest, I'm surprised that Prefect went that far without this. It made me wonder how others were doing it, or if we were doing something very exotic (but no, because other orchestrators have this concept).

Baker

02/22/2024, 10:56 PM

😂 1

Kevin Grismore

02/22/2024, 10:56 PM

the design of prefect encourages less constrained thinking when it comes to writing your workflows, because nothing strictly needs to be a DAG. but running things across many containers becomes less surface-level as a result 🤷

Baker

02/22/2024, 10:58 PM

The alternative I identified was to use tasks running in containers via coiled functions, but that adds a bunch of complexity and dependencies that my team wouldn't like, I'm sure.

Kevin Grismore

02/22/2024, 10:58 PM

I was actually going to suggest exactly that

Kevin Grismore

02/22/2024, 10:59 PM

imo it's pretty awesome

Kevin Grismore

02/22/2024, 10:59 PM

but you should be able to choose
