jorwoods
02/07/2020, 9:13 PM
python flow_x.py?
2. If I update the flow, I then need to stop the existing process running said flow and restart it with the updated version. Is that correct? Is there a way to refresh this automatically? (Seems like this is part of Prefect Cloud, but wanted to check.)
3. What if I have a branch in a flow where several mapped tasks share some common dependency that has costly start-up/tear-down? (E.g. several tasks in the flow can run, then start a shared EMR cluster, use the EMR cluster, then when all are complete, tear the cluster down.)
Chris White
02/07/2020, 10:31 PM
flow.run() is the main entrypoint for executing your flow using Core alone. Whether you call that interactively or through a script is up to you.
2.) Yes, that is essentially accurate - there are no automatic refresh hooks in Core alone. You could create a custom wrapper which loads the Flow from a serialized object on every run or something like that, but we don't expose any first-class way (other than by using Cloud).
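The custom-wrapper idea in 2.) could look roughly like the sketch below. It is not a Prefect API: it uses the standard library's pickle as a stand-in for "a serialized object", and load_latest, run_latest_loop, and the file path are hypothetical names. It assumes only that the stored object has a run() method, as a Flow does; a real Flow may need a more capable serializer than plain pickle.

```python
import pickle
import time


def load_latest(path):
    """Deserialize whatever flow-like object is currently stored at `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)


def run_latest_loop(path, interval_seconds=60, max_runs=None):
    """Reload the object from disk before every run, then call its run().

    Because the file is re-read each time, overwriting it with a new
    serialized version takes effect on the next run without restarting
    this process. `max_runs=None` loops until the process is stopped.
    """
    results = []
    runs = 0
    while max_runs is None or runs < max_runs:
        flow_like = load_latest(path)  # fresh copy on every iteration
        results.append(flow_like.run())
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(interval_seconds)
    return results
```

The long-running process would call something like run_latest_loop("flow.pkl"), and a deploy step would overwrite that file whenever the flow changes. Writing the new file to a temporary name and renaming it into place avoids the loop reading a half-written file.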
3.) We are considering implementing the concept of a “shared resource” on tasks, but it is still in the early stages. In the meantime, I’d recommend creating a task which starts the EMR cluster prior to your other tasks running and then a final task with an always_run trigger that tears the cluster down at the end.