Prefect Community #prefect-contributors-archived

itay livni

06/04/2020, 5:06 PM

Hi - I'd like some opinions and direction on where methods to join multiple flows should be written. https://github.com/PrefectHQ/prefect/pull/2719 . file:///home/ilivni/Downloads/disjoint_union.png @Laura Lorenz (she/her) summed up everything nicely here: https://github.com/PrefectHQ/prefect/issues/1745#issuecomment-637698235 Currently `Flow` has one method for this, `update`. My concern with continuing this approach is twofold: (1) it adds more code, and thus complexity, to a core piece of Prefect that is not really essential to `Flow`; (2) doing direct operations on a `Flow` can lead to undesirable effects (???). On the flip side, it's convenient: `node_resources_fl.disjoint(fl)`

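from prefect import Flow

# The three sub-flows are assumed to be defined elsewhere.
flow_lst = [get_secondary_sources_fl, get_primary_sources_fl, choose_defintions_fl]

# Test disjoint: fold each sub-flow into one combined flow.
# Note: .disjoint is the method proposed in the PR above, not an existing Flow method.
node_resources_fl = Flow("node_resources")
for fl in flow_lst:
    node_resources_fl.disjoint(fl)
node_resources_fl.visualize()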

The other way would be something along the lines of a module in `core.utilities`, structured like the networkx operators.

from prefect.utilities.flow_operators import disjoint_union
new_flow = disjoint_union(get_primary_sources_fl, get_secondary_sources_fl)
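To make the difference between the two joining styles concrete, here is a minimal sketch on a toy graph type. It does not use Prefect at all: `ToyFlow`, `update`, and `disjoint_union` are hypothetical stand-ins invented for illustration, and prefixing task names with the flow name is only one assumed way of keeping tasks apart (it assumes the flow names themselves are distinct).

```python
from dataclasses import dataclass, field


@dataclass
class ToyFlow:
    """Toy stand-in for a flow: a name, task names, and directed edges."""
    name: str
    tasks: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (upstream, downstream) pairs


def update(a: ToyFlow, b: ToyFlow, name: str) -> ToyFlow:
    """Plain merge: tasks that share a name collapse into a single node."""
    return ToyFlow(name, a.tasks | b.tasks, a.edges | b.edges)


def disjoint_union(a: ToyFlow, b: ToyFlow, name: str) -> ToyFlow:
    """Merge after prefixing each task with its flow name, so nothing collides."""
    merged = ToyFlow(name)
    for flow in (a, b):
        relabel = {t: f"{flow.name}.{t}" for t in flow.tasks}
        merged.tasks |= set(relabel.values())
        merged.edges |= {(relabel[u], relabel[v]) for u, v in flow.edges}
    return merged


primary = ToyFlow("primary", {"fetch", "parse"}, {("fetch", "parse")})
secondary = ToyFlow("secondary", {"fetch", "rank"}, {("fetch", "rank")})

merged = update(primary, secondary, "merged")
separate = disjoint_union(primary, secondary, "separate")

print(sorted(merged.tasks))    # the shared "fetch" appears once
print(sorted(separate.tasks))  # each flow keeps its own "fetch"
```

When the two flows share no task names, both functions produce the same graph up to relabeling; they only diverge when names collide.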

Chris White

06/04/2020, 5:11 PM

I have a few thoughts:
- I don’t see how `.disjoint` is different from `.update` when the flows don’t share any tasks? Prefect can’t forcibly separate two identical task classes, so I don’t see what additional work could be achieved by a `.disjoint` method.
- I wonder if we should create a `contrib` submodule of `utilities`; then both the complexity and maintenance burden are greatly reduced, and it’s more of a “This is a pattern a user found useful”.

itay livni

06/04/2020, 5:27 PM

@Chris White "i don’t see how `.disjoint` is different from `.update` when the flows don’t share any tasks? Prefect can’t force separate two identical task classes so I don’t see what additional work could be achieved by a `.disjoint` method" That is correct, there is no difference. However, Prefect does not have a way of handling non-unique parameters or (as I understand it) a way to differentiate `targets` with similarly named `tasks`.

"`contrib` submodule of `utilities`, and then both the complexity and maintenance burden are greatly reduced" I like this idea. It's also a lightweight way to test whether an idea is useful to the broader community.

Chris White

06/04/2020, 5:34 PM

For non-unique parameters specifically, I think we could special-case that in the `update` call. For targets / locations of similarly named tasks, I recommend using either `{task_run_id}` (which we will begin supporting in Core with the next release!) or `{task_slug}`, which is unique to each task.
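A rough illustration of why a slug in the target template helps when two tasks share a name. This is plain Python, assuming the template is filled with `str.format`-style substitution; the `{task_name}` key, the `results/` path, and every context value below are hypothetical placeholders, not values Prefect would generate.

```python
# Hypothetical context values for two runs of identically named tasks.
task_runs = [
    {"task_name": "fetch", "task_slug": "fetch-1", "task_run_id": "run-a"},
    {"task_name": "fetch", "task_slug": "fetch-2", "task_run_id": "run-b"},
]


def render(template: str, context: dict) -> str:
    """Fill a target template using Python's str.format."""
    return template.format(**context)


# Templating on the task name alone sends both tasks to the same location.
by_name = {render("results/{task_name}.pkl", c) for c in task_runs}

# Templating on the slug (or the run id) keeps the locations apart.
by_slug = {render("results/{task_slug}.pkl", c) for c in task_runs}
by_run_id = {render("results/{task_run_id}.pkl", c) for c in task_runs}

print(len(by_name), len(by_slug), len(by_run_id))
```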

itay livni

06/04/2020, 5:43 PM

"we could special case that in the `update`" 🙏 As for `{task_run_id}` and the slug: I have reservations (for my case) when they are not human readable. But no matter. What I am really going for with these methods is to merge `parameters` and build other task relationships between flows: (1) Write many flows (2) Test (3) Join (4) Deploy as one flow
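One possible reading of "merging parameters" when joining flows, sketched in plain Python. Everything here is an assumption made for illustration: `merge_parameters` is a hypothetical helper, parameters are modeled as a simple name-to-default mapping, and treating same-named parameters as one shared parameter (with an error on conflicting defaults) is just one policy among several.

```python
def merge_parameters(*flows: dict) -> dict:
    """Merge {parameter_name: default} mappings from several flows.

    A name that appears in more than one flow becomes a single shared
    parameter; conflicting defaults are rejected rather than silently picked.
    """
    merged = {}
    for params in flows:
        for name, default in params.items():
            if name in merged and merged[name] != default:
                raise ValueError(f"conflicting defaults for parameter {name!r}")
            merged[name] = default
    return merged


# Hypothetical parameters of two flows that are tested separately, then joined.
primary_params = {"word": None, "limit": 10}
secondary_params = {"word": None}

joined = merge_parameters(primary_params, secondary_params)
print(joined)
```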

Chris White

06/04/2020, 5:44 PM

ah i have good news for you! task slugs will be human readable very very soon!

👏 1

Chris White

06/04/2020, 5:48 PM

i’ll make sure to ping you when it’s ready for experimenting

itay livni

06/04/2020, 5:48 PM

lol

itay livni

06/09/2020, 7:37 PM

@Chris White Is there an issue/PR for human-readable slugs? I'd like to learn more. I'm coming across a case where I can remove some tags and perhaps call the slug instead for flows with duplicate task names.

Chris White

06/09/2020, 7:53 PM

Yup yup - this PR will be the place this gets implemented: https://github.com/PrefectHQ/prefect/pull/2531

👍 1
