Brett Naul
10/09/2020, 6:57 PMJim Crist-Harif
10/09/2020, 6:59 PMBrett Naul
10/09/2020, 7:03 PMdistributed.Client(executor.address).profile()
and save to HTMLJim Crist-Harif
10/09/2020, 7:05 PMDaskExecutor
in an on_exit
callback.Brett Naul
10/09/2020, 7:05 PMJim Crist-Harif
10/09/2020, 7:05 PMJoe Schmid
10/09/2020, 7:22 PMon_execute
callback in DaskCloudProviderEnvironment
to size the number of Dask workers based on Flow Parameters. Effectively something like "oh, you have a list of 12 models to train, let's get you 12 workers for that."Jim Crist-Harif
10/09/2020, 7:23 PMon_execute
callback is necessary with the new scheme. We may want to add something to make things more composable, but for now I'd like to avoid it. To get the behavior you'd want, you can write a function to create your cluster object and pass it to cluster_class
. When that function gets called the parameters (and anything else provided as context
) will already be in context
, so you'd have full access to configure the cluster however you wanted.LocalEnvironment
/`FargateTaskEnvironment` / KubernetesJobEnvironment
only), or the new KubernetesRun
run config.cluster_class
as a function, but maybe we'd want a clearer way?
• Dynamically configure the cluster scale/adapt. Currently either you'd have to do that in cluster_class
(create the cluster and call `scale`/`adapt` before returning), or use static kwargs in adapt_kwargs
.
• Do something before the cluster shuts down (@Brett Naul's case of saving the job profile)
• Do something completely unrelated to the executor before the job stopsDaskExecutor
(names subject to change):
• on_start(executor) -> None
, first thing called in executor.start()
.
• on_cluster_start(executor) -> None
, called after the cluster starts but before the flow is run. Could do scaling here if needed
• on_flow_run_stop(executor) -> None
, called after the flow run has completed, but before shutting the cluster down. Could save the profile here.
• on_stop(executor) -> None
, last thing called before exiting executor.start()
.
We could merge last two together into a single on_stop
if we don't care about having a callback after the cluster has stopped (would happen before cluster shutdown but after flow execution). Likewise, we could merge the first two if we don't care about having a callback before the cluster starts.
I'd like to minimize the number of configurables if possible, so minimizing possible callbacks would be nice unless they're all needed. Y'all have more experience actually running prefect than I do, so your thoughts would be useful here :).