jorwoods
02/07/2020, 9:13 PM
python flow_x.py?
2. If I update the flow, I then need to stop the existing process running said flow, and restart it with the updated version. Is that correct? Is there a way to refresh this automatically? (Seems like this is part of prefect cloud, but wanted to check)
3. What if a flow has a branch where several mapped tasks share a common dependency with costly start-up and tear-down? (E.g., several tasks in the flow run, then a shared EMR cluster is started, the mapped tasks use it, and once all of them are complete the cluster is torn down.)
Chris White
flow.run() is the main entrypoint for executing your flow using Core alone. Whether you call that interactively or through a script is up to you.
2.) Yes, that is essentially accurate: there are no automatic refresh hooks in Core alone. You could write a custom wrapper that loads the Flow from a serialized object on every run, or something similar, but we don't expose any first-class way to do this (other than by using Cloud).
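A minimal sketch of the kind of wrapper described in 2.), using only the standard library. Instead of deserializing a saved Flow, this variant re-executes the flow's script before every run, so edits to the file are picked up without restarting the process. The function names, the `flow` variable name, and the `max_runs` parameter are all assumptions made for illustration; it also assumes the flow has no schedule attached, so that `flow.run()` returns after a single run.

```python
import runpy
import time


def run_latest(script_path, variable="flow"):
    """Re-execute the flow script and run whatever flow it currently defines.

    run_name is deliberately not "__main__", so an
    `if __name__ == "__main__": flow.run()` guard in the script does not fire.
    """
    namespace = runpy.run_path(script_path, run_name="flow_definition")
    flow = namespace[variable]  # assumed: the script binds its flow to this name
    return flow.run()


def run_on_interval(script_path, interval_seconds, max_runs):
    """Call run_latest max_runs times, sleeping between consecutive runs."""
    results = []
    for i in range(max_runs):
        # The script is re-read here, so changes made since the last run apply.
        results.append(run_latest(script_path))
        if i < max_runs - 1:
            time.sleep(interval_seconds)
    return results
```

For example, `run_on_interval("flow_x.py", interval_seconds=300, max_runs=12)` would run the current contents of `flow_x.py` every five minutes for an hour. The same shape works with a serialized flow: only the loading line inside `run_latest` changes.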
3.) We are considering implementing the concept of a “shared resource” on tasks, but it is still in the early stages. In the meantime, I’d recommend creating a task which starts the EMR cluster prior to your other tasks running and then a final task with an always_run trigger that tears the cluster down at the end.