
Chu

07/19/2022, 7:35 PM
Hi community, what is the best practice for implementing parallel dbt jobs using Prefect 1.0? Basically, I need to pass a different client_id to each dbt job and trigger dbt run for each client_id (pseudocode as follows):
with Flow("dbt-runs") as flow:
  for i in id_list:
    dbt_run_function(i)
(I’m wondering if a simple for loop would achieve parallelism?)

Kevin Kho

07/19/2022, 7:36 PM

Chu

07/19/2022, 7:38 PM
If we don't have a DaskExecutor, will mapping still work? We are using Snowflake for data warehousing.
I'm wondering if mapping is the same as a for loop? (No reduce needed.)

Kevin Kho

07/19/2022, 9:22 PM
Mapping will just run sequentially if you don't have a parallel executor.
But yes, it is the DAG equivalent of the for loop.

Chu

07/19/2022, 10:14 PM
To ensure parallelism, I need to add a DaskExecutor, for example, right?

Kevin Kho

07/19/2022, 10:16 PM
Or LocalDaskExecutor, yep

Chu

07/19/2022, 10:24 PM
Thank you Kevin! I haven't used the Dask executor before. Will it incur any fees or require extra computing infrastructure? Or can I just import it in Python and it will work?

Kevin Kho

07/19/2022, 11:01 PM
No fees, because we bill per task anyway. You can just use it.
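
For anyone landing on this thread later, here is a minimal sketch of what the advice above might look like in Prefect 1.x: the for loop becomes a mapped task, and a LocalDaskExecutor is attached so the mapped runs can execute in parallel. The flow name, the `client_id` dbt variable name, the `dbt_run_command` helper, and `num_workers=4` are illustrative assumptions, not something confirmed in the thread.

```python
import json
import subprocess


def dbt_run_command(client_id):
    """Build the dbt CLI call for one client (the `client_id` var name is assumed)."""
    return ["dbt", "run", "--vars", json.dumps({"client_id": client_id})]


def build_flow(id_list):
    # Prefect 1.x imports, kept inside the function so the helper above
    # can be used without Prefect installed.
    from prefect import Flow, task
    from prefect.executors import LocalDaskExecutor

    @task
    def dbt_run_function(client_id):
        # check=True raises on a non-zero exit code, failing this mapped run
        subprocess.run(dbt_run_command(client_id), check=True)

    with Flow("dbt-per-client") as flow:  # hypothetical flow name
        # .map creates one child task run per element of id_list
        dbt_run_function.map(id_list)

    # Without an executor like this, mapped tasks run one after another
    flow.executor = LocalDaskExecutor(scheduler="threads", num_workers=4)
    return flow


# Example local run (needs Prefect 1.x and dbt on the PATH):
# build_flow(["client_a", "client_b", "client_c"]).run()
```

Threads are probably enough here, since each task mostly waits on a dbt subprocess and on Snowflake. A full DaskExecutor would only be needed to spread the work across several machines.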