Sean Harkins

    1 year ago
    We are experiencing extremely slow task submission via the DaskExecutor for very large mapped tasks. In previous flow tests where a task was mapped over roughly 20K items, task submission was fast enough that our Dask cluster scaled up to the worker limit. But with a task mapped over 400K items, the DaskExecutor task submission to the scheduler appears to be rate limited: there are never enough tasks on the scheduler for it to create more workers and scale, so we are stuck with the cluster crawling along with the minimum number of workers.
    Here is an example of a large mapped task
    Note the relatively small number of tasks that the scheduler has received. Normally the number of cache_inputs tasks should be growing very rapidly and the workers should be saturated, forcing the cluster to scale, but as you can see in the dashboard image below, task submission to the scheduler is slow for some reason.
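    To make the symptom concrete, here is a rough toy model of why a slow submission rate keeps the cluster pinned at the minimum worker count. Everything in it is an assumption for illustration: the `simulate` function, its parameters, and the scaling rule (request enough workers to clear the backlog within `target_seconds`) are hypothetical and are not Dask's or Prefect's actual implementation.

```python
import math


def simulate(submit_rate, task_seconds, min_workers, max_workers,
             target_seconds=5.0, steps=60):
    """Toy model of backlog-driven scaling; each step is one second.

    submit_rate  -- tasks reaching the scheduler per second (assumed constant)
    task_seconds -- assumed runtime of a single task on one worker
    Returns the worker count after each step.
    """
    backlog = 0.0
    workers = min_workers
    history = []
    for _ in range(steps):
        backlog += submit_rate                  # tasks arriving at the scheduler
        capacity = workers / task_seconds       # tasks the workers finish per second
        backlog = max(0.0, backlog - capacity)  # work left waiting on the scheduler
        # Hypothetical rule: enough workers to clear the backlog in target_seconds.
        wanted = math.ceil(backlog * task_seconds / target_seconds)
        workers = min(max_workers, max(min_workers, wanted))
        history.append(workers)
    return history


# Slow submission: the workers drain tasks as fast as they arrive,
# so no backlog builds and the cluster never asks for more workers.
slow = simulate(submit_rate=2, task_seconds=1.0, min_workers=2, max_workers=50)

# Fast submission: the backlog grows immediately and the cluster scales up.
fast = simulate(submit_rate=500, task_seconds=1.0, min_workers=2, max_workers=50)

print("slow submission, peak workers:", max(slow))
print("fast submission, final workers:", fast[-1])
```

    In this sketch the slow case never leaves the minimum worker count, because the scheduler never holds more work than the existing workers can drain. That matches what we are seeing with the 400K item map.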
    Andrew Black

    1 year ago
    Hi Sean, were you able to get an answer to resolve this?
    Sean Harkins

    1 year ago
    Hi Andrew. Apologies, I intended to post this in #prefect-community but hit random instead. @Kevin Kho from Prefect is investigating this behavior. See this tracking issue for more info: https://github.com/pangeo-forge/pangeo-forge-recipes/issues/208