Ben Fogelson
07/17/2020, 6:07 PM
`prefect` for something like this. The idea is that currently `Flow` objects have two major responsibilities: (1) building/maintaining a DAG and (2) supporting execution of that DAG. I propose splitting that functionality into two classes: `Flow`, which would support DAG execution, and `Composite` (or some other name), which builds and maintains a DAG. Crucially, `Composite` instances could be nested (i.e. `Composite.add_task` would accept either a `Task` or another `Composite` instance, and similarly for `replace`, `add_edge`, etc). To run the DAG, `Flow` would traverse the nested graph and build up a flattened DAG that just contains `Task` nodes, and then execute as usual.
nicholas
07/17/2020, 6:16 PM
Jeremiah
07/17/2020, 7:03 PM
`Flows` are used to build the DAG; and `FlowRunners` are used to execute it.
Ben Fogelson
07/17/2020, 7:10 PM
`Flows` build the DAG and also have information about how to build and register it so it is ready to be run? E.g. `environment`, `storage`, `register()`
Jeremiah
07/17/2020, 7:10 PM
Ben Fogelson
07/17/2020, 7:11 PM
Jeremiah
07/17/2020, 7:11 PM
`Composite` class completely independent of Prefect, as a way of generating your final flow.
Ben Fogelson
07/17/2020, 7:12 PM
Jeremiah
07/17/2020, 7:12 PM
`tags`) because a major part of this is the mental burden of keeping track of groups of tasks - look for that soon
Ben Fogelson
07/17/2020, 7:14 PM
`Flow` in a modular fashion. I totally get how factory functions are fantastic for modularization in the `Flow` construction process, but for modification something like a nested structure still feels useful. Would love to hear thoughts on other ways to do it though
`Composite` be separate could work
`Flow`
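For reference, the nested `Composite` idea from earlier in the thread could be sketched roughly as below. This is only an illustration and not part of Prefect: the `Composite` class, its `flatten` method, and the use of plain strings as stand-ins for `Task` objects are all assumptions. It also assumes that task names are unique across the whole nested structure, and that an edge between two composites means "every terminal task of the upstream composite runs before every root task of the downstream one".

```python
from typing import List, Set, Tuple, Union

# Hypothetical: a task is represented by its (unique) name.
Node = Union[str, "Composite"]


class Composite:
    """Hypothetical nested DAG builder; not a Prefect class."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.nodes: List[Node] = []
        self.edges: List[Tuple[Node, Node]] = []

    def add_task(self, node: Node) -> Node:
        # Accepts either a task name or another Composite.
        if node not in self.nodes:
            self.nodes.append(node)
        return node

    def add_edge(self, upstream: Node, downstream: Node) -> None:
        self.add_task(upstream)
        self.add_task(downstream)
        self.edges.append((upstream, downstream))

    def flatten(self) -> Tuple[Set[str], Set[Tuple[str, str]]]:
        """Return (tasks, edges) that contain only plain task names."""
        tasks: Set[str] = set()
        edges: Set[Tuple[str, str]] = set()
        entry = {}  # node -> task names with no upstream inside that node
        exit_ = {}  # node -> task names with no downstream inside that node
        for node in self.nodes:
            if isinstance(node, Composite):
                sub_tasks, sub_edges = node.flatten()
                tasks |= sub_tasks
                edges |= sub_edges
                entry[node] = sub_tasks - {down for _, down in sub_edges}
                exit_[node] = sub_tasks - {up for up, _ in sub_edges}
            else:
                tasks.add(node)
                entry[node] = {node}
                exit_[node] = {node}
        # Expand each edge between nodes into edges between plain tasks.
        for up, down in self.edges:
            edges |= {(u, d) for u in exit_[up] for d in entry[down]}
        return tasks, edges


extract = Composite("extract")
extract.add_edge("download", "parse")

load = Composite("load")
load.add_edge("validate", "write")

pipeline = Composite("pipeline")
pipeline.add_edge(extract, load)    # composite -> composite
pipeline.add_edge(load, "notify")   # composite -> task

tasks, edges = pipeline.flatten()
```

A real implementation would hand the flattened tasks and edges to a `Flow` for execution; how that hand-off would look is left open here.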
itay livni
07/17/2020, 7:31 PM
`flow_update(merge_parameters=True`
3. Run a `remove_middle_parameters` script
Steps 2 and 3 are combined in `build_flow.py`
See gist for more details...
https://gist.github.com/gryBox/da862d89ad0df49c02cc9b86e334b23c
Let me know if you need help understanding it. Can certainly be improved. Currently using this on 4 flows together.
Jeremiah
07/17/2020, 7:43 PM
`Flow` object doesn’t really contemplate this possibility).
I think this would be an appropriate place to build your own “Flow Factory” (whether a function, a class, a DSL etc.) that allows you to put all the building blocks together however you want to produce the “final” flow you want. Sort of one step higher than the Prefect API
itay livni
07/17/2020, 7:50 PM
Jeremiah
07/17/2020, 7:51 PM
Matt Wong-Kemp
07/17/2020, 8:16 PM
`project_name` and `flow_name` of the 'plug-in' flow parameters on the outer flow.
itay livni
07/17/2020, 8:24 PM
Matt Wong-Kemp
07/17/2020, 8:25 PM
`FlowRunTask(wait=True)` for subflow-running - here's a recursive flow that calls itself to calculate the fibonacci series value for a given `n`:
https://gist.github.com/emcake/032cef5147c468f0074c2cfe7d1703c4
The first caveat is this is obviously slightly mad. I'm also spotting some slightly weird behaviour where I can't get it to recurse properly on a local agent due to some weird parallelism thing, but I've definitely seen this work on previous versions.
The second caveat is that it doesn't calculate the fibonacci series! @Chris White your recent change to use an idempotency key in `FlowRunTask` of the run id means that, rather than calling the flow again for `n-1` and `n-2`, it calls it for whichever one wins and then the loser just gets the winner's value.
Is there any chance we can make the idempotency key a parameter into the `FlowRunTask` that uses the run id as the default? This way I could come up with a context-specific run id, like `fib-{n}` that'd even get me memoizing fibonacci runs. :)
Chris White
07/18/2020, 12:08 AM
Matt Wong-Kemp
07/18/2020, 12:14 AM
Jeremiah
07/18/2020, 12:15 AM
Matt Wong-Kemp
07/18/2020, 12:16 AM
Chris White
07/18/2020, 12:21 AM
Jeremiah
07/18/2020, 12:22 AM
"the deeper learning here is that flows are just functions"
This is a really insightful point @Matt Wong-Kemp
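To make the idempotency-key behaviour described above for the recursive fibonacci flow more concrete, here is a small pure-Python simulation. It does not use Prefect at all: `Backend`, `create_flow_run`, `fib_shared_key`, and `fib_custom_key` are hypothetical stand-ins, and the only assumption taken from the thread is that a flow run requested with an already-seen idempotency key returns the existing run instead of starting a new one. Because this simulation runs sequentially, the `n-1` call always "wins"; in a real deployment either child could.

```python
from typing import Callable, Dict


class Backend:
    """Hypothetical stand-in for the service that creates flow runs."""

    def __init__(self) -> None:
        self.results: Dict[str, int] = {}  # idempotency key -> run result
        self.executed = 0                  # number of runs actually started

    def create_flow_run(self, flow: Callable, n: int, idempotency_key: str) -> int:
        # A key that was seen before returns the existing run's value.
        if idempotency_key not in self.results:
            self.executed += 1
            run_id = f"run-{self.executed}"
            self.results[idempotency_key] = flow(self, n, run_id)
        return self.results[idempotency_key]


def fib_shared_key(backend: Backend, n: int, run_id: str) -> int:
    """Both child runs reuse the parent's run id as their key."""
    if n < 2:
        return n
    a = backend.create_flow_run(fib_shared_key, n - 1, idempotency_key=run_id)
    b = backend.create_flow_run(fib_shared_key, n - 2, idempotency_key=run_id)
    return a + b  # b is just a copy of a, so the sum is wrong


def fib_custom_key(backend: Backend, n: int, run_id: str) -> int:
    """Each child run gets a context-specific key such as fib-{n}."""
    if n < 2:
        return n
    a = backend.create_flow_run(fib_custom_key, n - 1, idempotency_key=f"fib-{n - 1}")
    b = backend.create_flow_run(fib_custom_key, n - 2, idempotency_key=f"fib-{n - 2}")
    return a + b


shared = Backend()
wrong = shared.create_flow_run(fib_shared_key, 5, idempotency_key="root")

custom = Backend()
right = custom.create_flow_run(fib_custom_key, 5, idempotency_key="fib-5")
```

With the shared key the second child never runs, so the parent adds the first child's value to itself. With a `fib-{n}` style key each distinct `n` runs once and later requests reuse that result, which is the memoization effect mentioned above.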
Matt Wong-Kemp
07/18/2020, 12:24 AM
Jeremiah
07/18/2020, 12:25 AM
`Success` state of a flow could just reflect its reference tasks
Matt Wong-Kemp
07/18/2020, 12:27 AM
"the deeper learning here is that flows are just functions"
🚨 nerd alert 🚨 here but technically flows are members of the same selective-applicative-functor typeclass as tasks, since they also carry state and all of the flow control options that tasks do :)
Chris White
07/18/2020, 12:28 AM
itay livni
07/18/2020, 12:33 AM
Matt Wong-Kemp
07/18/2020, 12:41 AM
`x`, output result in task with slug `result`) then you can treat the inner part as a black box you don't really care about.
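The "black box" view in the last message can be sketched in a few lines of plain Python. Nothing here is Prefect API: `as_function` and `run_inner_flow` are hypothetical names, and the sketch assumes that running a flow yields each task's result keyed by its slug, with the input passed as a parameter named `x` and the output read from the task whose slug is `result`.

```python
from typing import Any, Callable, Dict


def as_function(
    run_flow: Callable[[Dict[str, Any]], Dict[str, Any]],
    output_slug: str = "result",
) -> Callable[..., Any]:
    """Wrap a flow runner so callers see only parameters in, one value out."""

    def call(**parameters: Any) -> Any:
        results_by_slug = run_flow(parameters)
        return results_by_slug[output_slug]

    return call


def run_inner_flow(parameters: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for running a flow: returns every task result keyed by slug.
    doubled = parameters["x"] * 2
    return {"double": doubled, "result": doubled + 1}


inner = as_function(run_inner_flow)
value = inner(x=3)
```

The caller only depends on the parameter name and the output slug, so the tasks inside the inner flow can change without affecting the outer one.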