
itay livni

06/04/2020, 5:06 PM
Hi - I'd like some opinions and direction on where methods to join multiple flows should be written. https://github.com/PrefectHQ/prefect/pull/2719 . file:///home/ilivni/Downloads/disjoint_union.png @Laura Lorenz (she/her) summed up everything nicely here: https://github.com/PrefectHQ/prefect/issues/1745#issuecomment-637698235 Currently `Flow` has one method for this, `update`. My concern with continuing this approach is twofold: (1) it adds more code, and thus complexity, to a core piece of Prefect, and that code is not really essential to `Flow`; (2) doing direct operations on a `Flow` can lead to undesirable effects (???). On the flip side, it's convenient: `node_resources_fl.disjoint(fl)`
flow_lst = [get_secondary_sources_fl, get_primary_sources_fl, choose_defintions_fl]

# Test disjoint
node_resources_fl = Flow("node_resources")
for fl in flow_lst:
    node_resources_fl.disjoint(fl)
node_resources_fl.visualize()
The other way would be something along the lines of a module in `core.utilities`, structured like the networkx operators.
from prefect.utilities.flow_operators import disjoint_union
new_flow = disjoint_union(get_primary_sources_fl, get_secondary_sources_fl)
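
To make the function-style option concrete, here is a minimal sketch of what such an operator could look like. It does not use Prefect at all: `MiniFlow` is a hypothetical toy stand-in for a flow, and prefixing task names with the source flow's name is an assumption made for illustration (networkx's own `disjoint_union` relabels nodes differently).

```python
from dataclasses import dataclass, field


@dataclass
class MiniFlow:
    """Hypothetical toy stand-in for a flow: a name, task names, and edges."""
    name: str
    tasks: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (upstream, downstream) pairs


def disjoint_union(*flows, name="joined"):
    """Return a new MiniFlow holding every task and edge of the inputs.

    Task names are prefixed with the source flow's name, so two tasks that
    share a name never collapse into a single node. Inputs are not mutated.
    """
    joined = MiniFlow(name)
    for fl in flows:
        prefix = f"{fl.name}."
        joined.tasks |= {prefix + t for t in fl.tasks}
        joined.edges |= {(prefix + u, prefix + d) for u, d in fl.edges}
    return joined


primary = MiniFlow("primary", {"fetch", "parse"}, {("fetch", "parse")})
secondary = MiniFlow("secondary", {"fetch", "rank"}, {("fetch", "rank")})
new_flow = disjoint_union(primary, secondary)
print(sorted(new_flow.tasks))
```

The point of the design is that the operator returns a new object and leaves its inputs alone, which is the main difference from calling a method that mutates an existing flow in place.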

Chris White

06/04/2020, 5:11 PM
I have a few thoughts:
- I don’t see how `.disjoint` is different from `.update` when the flows don’t share any tasks. Prefect can’t forcibly separate two identical task classes, so I don’t see what additional work could be achieved by a `.disjoint` method.
- I wonder if we should create a `contrib` submodule of `utilities`; then both the complexity and maintenance burden are greatly reduced and it’s more of “This is a pattern a user found useful”.

itay livni

06/04/2020, 5:27 PM
@Chris White "i don’t see how `.disjoint` is different from `.update` when the flows don’t share any tasks? Prefect can’t force separate two identical task classes so I don’t see what additional work could be achieved by a `.disjoint` method"
That is correct, there is no difference. However, Prefect does not have a way of handling non-unique parameters or (as far as I understand) a way to differentiate `targets` for similarly named `tasks`.
"`contrib` submodule of `utilities`, and then both the complexity and maintenance burden are greatly reduced"
I like this idea -- it's also a lightweight way to test whether an idea is useful to the broader community.

Chris White

06/04/2020, 5:34 PM
For non-unique parameters specifically I think we could special case that in the `update` call. For targets / locations of similarly named tasks I recommend using either `{task_run_id}` (which we will begin supporting in Core with the next release!) or `{task_slug}`, which is unique to each task.
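
A small illustration of why a slug-based template helps, using only Python's `str.format`. The task metadata, slug values, and paths below are made up for the example; in Prefect the template values are filled in at run time.

```python
# Hypothetical metadata for two tasks that share a name but not a slug.
tasks = [
    {"task_name": "fetch", "task_slug": "fetch-1"},
    {"task_name": "fetch", "task_slug": "fetch-2"},
]


def render(template, task):
    """Fill a target template from one task's metadata."""
    return template.format(**task)


# Templating on the name alone makes both tasks write to the same target.
by_name = {render("results/{task_name}.json", t) for t in tasks}

# Templating on the slug keeps the two targets distinct.
by_slug = {render("results/{task_slug}.json", t) for t in tasks}

print(len(by_name), len(by_slug))
```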

itay livni

06/04/2020, 5:43 PM
"we could special case that in the `update`" 🙏 As for `{task_run_id}` and the slug, I have reservations (for my case) when they are not human readable. But no matter. What I am really going for with these methods is to merge `parameters` and build other task relationships between flows: (1) write many flows, (2) test, (3) join, (4) deploy as one flow.
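
One possible reading of "merging parameters" when joining flows, sketched with plain dictionaries rather than Prefect objects. The rule used here (a parameter that appears in several flows is shared if its defaults agree, and is an error otherwise) is an assumption for illustration, not how `update` behaves; `merge_parameters` and the parameter names are hypothetical.

```python
def merge_parameters(*param_sets):
    """Merge {name: default} mappings from several flows into one.

    A name that appears more than once is treated as a single shared
    parameter, provided every occurrence has the same default.
    """
    merged = {}
    for params in param_sets:
        for name, default in params.items():
            if name in merged and merged[name] != default:
                raise ValueError(f"parameter {name!r} has conflicting defaults")
            merged[name] = default
    return merged


primary_params = {"term": None, "limit": 10}
secondary_params = {"term": None, "lang": "en"}

# "term" appears in both flows with the same default, so it is shared.
shared = merge_parameters(primary_params, secondary_params)
print(shared)
```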

Chris White

06/04/2020, 5:44 PM
ah i have good news for you! task slugs will be human readable very very soon!
👏 1
i’ll make sure to ping you when it’s ready for experimenting

itay livni

06/04/2020, 5:48 PM
lol
@Chris White Is there an issue/PR for human readable slugs? I'd like to learn more. I'm coming across a case where I could remove some tags and perhaps use the slug instead for flows with duplicate task names.

Chris White

06/09/2020, 7:53 PM
Yup yup - this PR will be the place this gets implemented: https://github.com/PrefectHQ/prefect/pull/2531
👍 1