Mike Marinaccio

    1 year ago
    Hi everyone! I’m exploring ideas around dynamically configuring a Fargate instance/cluster based on a flow parameter / input. In short, I have an hourly job that migrates data for a group of clients based on their timezone. For some hours of the day, there will be more clients and thus more work. Ideally, I want my flow to scale resources based on the number of clients queried for a timezone. Does anyone have ideas for a way to do this? Since flow environments are set at registration time, I’m struggling to find a feasible approach. I’ve also started to explore the recently added ECR Task and run_config, which sound like a potential solution. Thanks for the input!

    Joe Schmid

    1 year ago
    @Mike Marinaccio We do exactly this using DaskCloudProviderEnvironment and the on_execute() callback. See this example in the docs: https://docs.prefect.io/orchestration/execution/dask_cloud_provider_environment.html#advanced-example-dynamic-worker-sizing-from-parameters-tls-encryption
    IMPORTANT NOTE: I believe the Prefect team is moving away from this environment and instead recommending use of a LocalEnvironment with a DaskExecutor, e.g.
    DaskExecutor(
        cluster_class="dask_cloudprovider.FargateCluster", ...
    )
    I haven't had a chance yet to look at implementing the same "dynamically set Dask worker sizing based on Flow parameters" idea using that approach.
    Mike Marinaccio

    1 year ago
    Oh this is perfect. Thanks @Joe Schmid!
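
The sizing idea discussed in this thread (more clients in a timezone means more Dask workers) can be sketched in plain Python. This is a minimal illustration, not the code either poster runs: the function name fargate_cluster_kwargs and the constants CLIENTS_PER_WORKER and MAX_WORKERS are hypothetical, and the thread leaves open whether the DaskExecutor approach can pick these values up at run time rather than at registration time.

```python
import math

# Hypothetical tuning constants, not taken from the thread.
CLIENTS_PER_WORKER = 50   # how many clients one worker is assumed to handle
MAX_WORKERS = 10          # upper bound to keep costs predictable


def fargate_cluster_kwargs(client_count,
                           clients_per_worker=CLIENTS_PER_WORKER,
                           max_workers=MAX_WORKERS):
    """Map a client count to keyword arguments for a Fargate-backed Dask cluster."""
    if client_count < 0:
        raise ValueError("client_count must be non-negative")
    # At least one worker, at most max_workers, otherwise proportional to load.
    wanted = math.ceil(client_count / clients_per_worker)
    n_workers = max(1, min(max_workers, wanted))
    return {
        "n_workers": n_workers,
        # 1024 CPU units with 4096 MiB is a valid Fargate task size.
        "worker_cpu": 1024,
        "worker_mem": 4096,
    }


# Assumed wiring (left commented out because it needs Prefect and
# dask_cloudprovider installed, and was not verified in this thread):
#
# executor = DaskExecutor(
#     cluster_class="dask_cloudprovider.FargateCluster",
#     cluster_kwargs=fargate_cluster_kwargs(120),
# )
```

With these assumed constants, an hour with 120 clients asks for 3 workers, and very busy hours are capped at 10.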