Hi all, is there any concept of stopping a flow after a certain task to examine outputs and then proceed with user-selected results as parameters for a downstream task? The example that comes to mind is the result of, say, a grid search.
Kevin Kho
07/19/2021, 7:19 PM
Hey @Leon Kozlowski, so we have a PAUSE flag that can be used for manual intervention. In this case, I think you can combine the PAUSE with starting another Flow run that uses the updated grid for the grid search. Is the use case here that you want it all to be contained in one Flow?
Leon Kozlowski
07/19/2021, 7:26 PM
Doesn’t have to be just one flow, but does the PAUSE only allow for approving/denying, or can there be some sort of parametrization for the next flow to be kicked off?
Kevin Kho
07/19/2021, 7:29 PM
The PAUSE only allows for approving/denying. The parameterization would have to happen separately.
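To make the split concrete, here is a minimal sketch in plain Python (no Prefect imports) of a binary approve/deny gate with the parameter choice handled as a separate step between runs. The helper names `persist_grid_results` and `parameters_for_next_run`, the JSON file, and the scores are all hypothetical and only for illustration; in a real setup the returned dictionary would be passed as the parameters of a newly started flow run.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def persist_grid_results(results: List[Dict[str, Any]], path: Path) -> None:
    """End of the first run: store every candidate so a person can review it."""
    path.write_text(json.dumps(results, indent=2))


def parameters_for_next_run(path: Path, approved: bool, selected: int) -> Dict[str, Any]:
    """Between runs: turn the reviewer's decision into parameters for a new run."""
    if not approved:
        # The gate itself is binary, mirroring approve/deny.
        raise RuntimeError("Review was denied; no follow-up run is started.")
    results = json.loads(path.read_text())
    # The parameter choice is a separate input, not part of the gate.
    return dict(results[selected]["params"])


# Hypothetical grid search output; the numbers are made up for illustration.
grid_results = [
    {"params": {"max_depth": 3, "learning_rate": 0.1}, "score": 0.81},
    {"params": {"max_depth": 5, "learning_rate": 0.1}, "score": 0.86},
    {"params": {"max_depth": 5, "learning_rate": 0.3}, "score": 0.84},
]

results_path = Path(tempfile.mkdtemp()) / "grid_results.json"
persist_grid_results(grid_results, results_path)

# The reviewer approves and picks the second candidate (index 1).
next_parameters = parameters_for_next_run(results_path, approved=True, selected=1)
print(next_parameters)
```

Keeping the gate and the selection as two inputs reflects the point above: approval only decides whether anything continues, while the chosen parameters travel into a fresh run.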
Kevin Kho
07/19/2021, 7:29 PM
Is your grid search parallelized?
Leon Kozlowski
07/19/2021, 7:34 PM
I haven’t implemented anything yet, just gathering input from the team on potential use cases
Kevin Kho
07/19/2021, 7:45 PM
Ah ok. I’m just wondering because at my last job we had a parallelized grid search that ran daily. We would then review the metrics and plots and decide how to alter the grid search. It wasn’t quite “mid-run” though.
There is a use case though for Bayesian optimization type approaches that are iterative, build on top of the last result, and go on indefinitely. I suppose you could PAUSE, check whether the model training has converged, and resume if not. But with that, I’m not sure the parameters could be changed mid-run, so you’d probably have to start a new run.
I think as long as the experiment results are persisted somewhere like MLflow and you have history, creating new Prefect flows will work.
👍 1
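As a rough illustration of the "persist history, then start a new run" idea, the sketch below treats each loop iteration as a separate run that reads the stored history, evaluates a small grid, and records its best point. Everything here is an assumption made for the example: the JSON file stands in for an experiment tracker such as MLflow, `objective` is a toy loss, the grid is narrowed by halving its width, and the convergence check is deliberately naive.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def objective(x: float) -> float:
    """Toy stand-in for a validation loss; its minimum is at x = 2.0."""
    return (x - 2.0) ** 2


def load_history(path: Path) -> List[Dict[str, float]]:
    """Read all previously recorded runs, or an empty list on the first run."""
    return json.loads(path.read_text()) if path.exists() else []


def run_once(path: Path, center: float, width: float) -> Dict[str, float]:
    """One 'run': evaluate a three-point grid and append the best point to history."""
    grid = [center - width, center, center + width]
    best_x = min(grid, key=objective)
    record = {"x": best_x, "loss": objective(best_x), "width": width}
    history = load_history(path)
    history.append(record)
    path.write_text(json.dumps(history))
    return record


def converged(history: List[Dict[str, float]], tol: float) -> bool:
    """Naive check: the best loss changed by less than tol between the last two runs."""
    if len(history) < 2:
        return False
    return abs(history[-2]["loss"] - history[-1]["loss"]) < tol


history_path = Path(tempfile.mkdtemp()) / "history.json"
center, width = 0.0, 4.0
for _ in range(20):  # hard cap so the loop cannot run indefinitely
    record = run_once(history_path, center, width)
    if converged(load_history(history_path), tol=1e-6):
        break
    # Parameters for the next run are derived from the persisted result.
    center, width = record["x"], width / 2

print(record, len(load_history(history_path)))
```

Because each iteration only depends on what was written to the history file, the loop body could just as well be a new run started after a manual review, with `center` and `width` supplied as that run's parameters.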
Leon Kozlowski
07/19/2021, 7:51 PM
Gotcha, I’m just scouting ways I can further enhance some of our ML workflows, appreciate the info Kevin!