Seth Goodman

08/05/2022, 4:31 PM
Hi all, I have a question about the expected behavior of flows when using task mapping. When I use "apply_map" to map ~1500 tasks, the UI shows only a set of 8 "Constant[list][x]" tasks, as in the screenshot below. I originally thought this might be tied to my use of DaskExecutor, but the number of "Constant" tasks does not match the number of Dask workers I have running. In addition, each Constant task shows the same x/1500 tasks running, rather than a subset. Based on runtime, it seems plausible that each Constant task is processing all of the data. I'm still testing to confirm this, but it feels like I'm doing something wrong that would be obvious to more experienced users. I've included some simplified code below that represents my implementation. Thanks for the help!
from prefect import Flow, apply_map

def actual_task(arg1, arg2, arg3):
    # does stuff
    ...

task_list = [
    (1, "a", "b"),
    (2, "c", "d"),
    (3, "e", "f"),
    (4, "g", "h"),
]

def task_map(task):
    return actual_task(task[0], task[1], task[2])

with Flow("my_flow") as flow:
    task_results = apply_map(task_map, task_list)
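
For comparison, here is a minimal plain-Python sketch of the elementwise behavior expected from the mapping, with no Prefect or Dask involved. The `calls` counter and the string returned by `actual_task` are hypothetical, added only so each call can be checked. It does not reproduce or explain what the UI shows.

```python
from collections import Counter

# Hypothetical bookkeeping: counts how many times each tuple is processed.
calls = Counter()

def actual_task(arg1, arg2, arg3):
    # Stand-in for the real work; records the arguments it was called with.
    calls[(arg1, arg2, arg3)] += 1
    return f"{arg1}-{arg2}-{arg3}"

task_list = [
    (1, "a", "b"),
    (2, "c", "d"),
    (3, "e", "f"),
    (4, "g", "h"),
]

def task_map(task):
    return actual_task(task[0], task[1], task[2])

# Plain-Python equivalent of an elementwise map: one call per tuple.
results = [task_map(item) for item in task_list]

# With elementwise mapping, every tuple is processed exactly once.
processed_once = all(calls[item] == 1 for item in task_list)
```

If every mapped run were instead processing the whole list, the equivalent counter would show each tuple handled more than once.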