Josh Greenhalgh
01/27/2021, 5:24 PM
```
p1_res = extract_raw_from_api.map(
    asset_location=asset_locations,
    extract_date=unmapped(extract_date),
    extract_period_days=unmapped(extract_period_days),
)
p2_res = process_2.map(p1_res)
p3_res = process_3.map(p2_res)
```
Shouldn't the corresponding process_2 task start as soon as the first task from p1_res is done?
As it stands, no process_2 tasks start until many of the extract_raw_from_api tasks are already complete...
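For reference, a minimal self-contained sketch of the kind of flow being described here; the task bodies, flow name, and input values are hypothetical, filled in only to make the snippet runnable:

```
from prefect import Flow, task, unmapped

# Hypothetical task bodies -- the real implementations aren't shown in the thread.
@task
def extract_raw_from_api(asset_location, extract_date, extract_period_days):
    return {"location": asset_location}  # e.g. pull raw records from an API

@task
def process_2(raw):
    return raw  # first processing step

@task
def process_3(processed):
    return processed  # second processing step

with Flow("asset-etl") as flow:
    asset_locations = ["site-a", "site-b", "site-c"]  # hypothetical inputs
    p1_res = extract_raw_from_api.map(
        asset_location=asset_locations,
        extract_date=unmapped("2021-01-27"),
        extract_period_days=unmapped(7),
    )
    # Each mapped child of process_2 depends only on its matching child of
    # extract_raw_from_api, so with a parallel executor it can start as soon
    # as that one upstream child finishes.
    p2_res = process_2.map(p1_res)
    p3_res = process_3.map(p2_res)
```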
josh
01/27/2021, 5:31 PM

Josh Greenhalgh
01/27/2021, 5:31 PM

josh
01/27/2021, 5:32 PM

Josh Greenhalgh
01/27/2021, 5:33 PM

Josh Greenhalgh
01/27/2021, 5:33 PM

josh
01/27/2021, 5:33 PM

Josh Greenhalgh
01/27/2021, 5:34 PM

Josh Greenhalgh
01/27/2021, 5:34 PM

Jim Crist-Harif
01/27/2021, 5:35 PM
You don't need a distributed cluster (i.e. a DaskExecutor) to get parallelization, but you do need to use something other than the default LocalExecutor. If your tasks are all IO-bound (for example, starting and monitoring external k8s jobs), you might find the LocalDaskExecutor sufficient. This uses a local thread pool to run tasks in parallel.
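A sketch of what that looks like, assuming the Prefect 0.14-era API used in this thread (where the executor is attached to the flow); the worker count is illustrative:

```
from prefect.executors import LocalDaskExecutor

# Local thread pool: mapped children run in parallel inside one process,
# which is usually enough when tasks spend most of their time waiting on IO.
flow.executor = LocalDaskExecutor(scheduler="threads", num_workers=8)

# A full DaskExecutor (separate worker processes or a remote cluster) is only
# needed when tasks are CPU-bound or must scale beyond a single machine:
# from prefect.executors import DaskExecutor
# flow.executor = DaskExecutor()
```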
Jim Crist-Harif
01/27/2021, 5:35 PM

Josh Greenhalgh
01/27/2021, 5:36 PM

Josh Greenhalgh
01/27/2021, 5:37 PM

Jim Crist-Harif
01/27/2021, 5:37 PM

Josh Greenhalgh
01/27/2021, 5:38 PM

Josh Greenhalgh
01/27/2021, 5:43 PM
```
with Flow(
    name=name,
    storage=storage,
    run_config=run_config,
    schedule=schedule,
    executor=DaskExecutor(),
) as flow:
```
And now my mapped tasks don't even start; I get a pickling error about max recursion depth 😞

Jim Crist-Harif
01/27/2021, 6:09 PM

Jim Crist-Harif
01/27/2021, 6:10 PM
If you use executor=LocalDaskExecutor() you won't get runtime pickling, since all your tasks run in the same process. (This shouldn't be required though; the error you got above isn't expected.)
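If the max-recursion pickling error does come back with a full DaskExecutor, one way to narrow it down (a general debugging suggestion, not something prescribed in the thread) is to try serializing the flow with cloudpickle, which is what Dask uses under the hood:

```
import cloudpickle

# A failure here usually points at a task input or closure that can't be
# pickled (open connections, clients, self-referential objects, etc.).
cloudpickle.dumps(flow)
```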
Josh Greenhalgh
01/27/2021, 6:13 PM

Josh Greenhalgh
01/27/2021, 6:14 PM