So I'm refactoring a Flow into a Flow of Flows. It runs a bunch of Papermill notebooks. They have to be run in a certain order - but in some cases, a dependency isn't relevant, so we skip it. This currently takes the form of a bunch of Tasks along the lines of
if notebook_name in list_of_notebooks:
    run_notebook(notebook_name)
else:
    print("skipped")
How do I do that with the StartFlowRun task? Should I subclass it and replace the existing .run() method?
Kevin Kho
10/22/2021, 11:26 PM
Subclassing the run makes sense here. Then you can just trigger the StartFlowRun call if it fulfills the condition.
Matt Alhonte
10/22/2021, 11:29 PM
Cool, thanks! So I guess I'll move the current .run() fn to a different name (let's say .execute_flow), and then have the new .run() be something like: if notebook_name in list_of_notebooks: self.execute_flow()?
Kevin Kho
10/23/2021, 1:48 AM
You can override .run() and call super() inside it, right? No need to rename the original method.
The DBTShellTask is a good example of this as it just overrides the ShellTask then calls
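A rough sketch of the override-and-call-super idea from this thread. To keep it runnable without Prefect installed, the StartFlowRun class below is a hypothetical stand-in, not the real task; in Prefect 1.x the real one would presumably be imported with from prefect.tasks.prefect import StartFlowRun instead. ConditionalStartFlowRun, its notebook_name argument, and passing list_of_notebooks into run() are all assumptions made for illustration.

```python
class StartFlowRun:
    """Hypothetical stand-in for Prefect's StartFlowRun task (assumption)."""

    def __init__(self, flow_name=None, project_name=None):
        self.flow_name = flow_name
        self.project_name = project_name

    def run(self, flow_name=None, parameters=None):
        # The real task would create a flow run through the Prefect API.
        # The stand-in only reports which flow it would have started.
        return f"started {flow_name or self.flow_name}"


class ConditionalStartFlowRun(StartFlowRun):
    """Starts the child flow only if its notebook was requested."""

    def __init__(self, notebook_name, **kwargs):
        self.notebook_name = notebook_name
        super().__init__(**kwargs)

    def run(self, list_of_notebooks=(), **kwargs):
        if self.notebook_name not in list_of_notebooks:
            # Same behaviour as the original Tasks: note the skip, do nothing.
            print(f"skipped {self.notebook_name}")
            return None
        # Condition met: hand off to the parent's run() unchanged.
        return super().run(**kwargs)


task = ConditionalStartFlowRun("clean_data", flow_name="clean_data_flow")
print(task.run(list_of_notebooks=["clean_data", "train"]))  # delegates to the parent
print(task.run(list_of_notebooks=["train"]))  # takes the skip branch
```

Because the parent's run() is reached through super(), nothing has to be renamed, and any arguments the parent accepts can still be passed straight through as keyword arguments.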