Laura Lorenz (she/her)

06/22/2020, 6:20 PM
Just wanted to give a 👏 shout out 👏 to @Mark McDonald who made a DbtShellTask a while back in the task library and with his help Prefect now has a section in the dbt docs here: https://docs.getdbt.com/docs/running-a-dbt-project/running-dbt-in-production/#using-prefect. Hoping to see more dbt users in the Prefect community now :marvin:
👏 13
🙌 4
🚀 2
💯 1

Robin

07/12/2020, 9:44 PM
@Mark McDonald, we are new to dbt and Prefect. How exactly can one use the DbtShellTask to run dbt models? Where would the dbt code live?

Mark McDonald

07/12/2020, 9:53 PM
The dbt project (or any helper files, for that matter) can be deployed alongside the Prefect Python flow code in the Docker image. So, assuming you're using Docker storage, there is an option to copy files over. I add my dbt project files through this: https://github.com/PrefectHQ/prefect/blob/0b3ef62bc99c5901b1fb9ae77142f7e5c5c74187/src/prefect/environments/storage/docker.py#L57-L58
🚀 1

Robin

07/12/2020, 10:51 PM
Wow, thanks for the immediate answer! Is it a good practice to put all code dependencies in the docker image of prefect and then execute it like you described? Or is it rather an exception for dbt?

Mark McDonald

07/13/2020, 2:06 PM
hey @Robin, yes it's good practice to have the dbt code in the docker image to execute it as part of the flow. If you don't do it this way, I guess you could have a task that pulls down the dbt code from another location (like s3) at the start of the flow, but I don't see much benefit in doing that.

Robin

07/13/2020, 4:22 PM
Alright, thanks a lot! 🙂
👍 1
Hey @Mark McDonald, a few follow-up questions (hope it's fine to ask here):
1. If I understood correctly, the files dict for the Docker storage can't be a single entry mapping the dbt source folder to a target path in the image; instead it needs one entry per source file in the dbt folder, each with its target path in the image. Is that correct?
2. If that's correct, how do you generate this dict?
3. Why do we have to provide the profiles_dir? Is it to create the profiles.yml file in the Docker container?
Once I have fully understood (and run) the DbtShellTask, I could add some further description and a small example from a beginner's perspective to the Prefect documentation. Code:
import prefect
from prefect import Flow, task
from prefect.environments.storage import Docker
from prefect.tasks.aws import AWSSecretsManager
from prefect.tasks.dbt import DbtShellTask

PROFILES = "."  # "~/.dbt/"
secret_name = "sf_credentials"

with Flow("dbt_flow") as flow:

    s = AWSSecretsManager(secret_name)

    task = DbtShellTask(
        environment="Development",
        dbt_kwargs={
            "type": "snowflake",
            "threads": 1,
            "account": s["sf_account"],
            "user": s["sf_user"],
            "password": s["sf_password"],
        },
        profiles_dir=PROFILES,
    )(command="dbt run")

# flow.register(project_name="eks_test_01")

flow.storage = Docker(files={"/Users/robinbeer/dev/code/accure-etl": "dbt"})

Mark McDonald

07/23/2020, 2:26 PM
Hi @Robin - I hope these answers help.
1. There are ways to copy the files into the Docker image without using this feature of Prefect's Docker storage, so don't feel beholden to it. I think the dict takes one key-value pair (source path to destination path) for each file you want to copy over. If you have a huge dbt project, this could be a large dict.
2. You could iterate through the directory where your dbt files live and build the dict that way. Maybe something like the following:
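import os
from pathlib import Path

from prefect.environments.storage import Docker

# Map each file in src/dbt (absolute path on the build machine) to its path in the image.
# Note: os.listdir is not recursive, so files inside subdirectories are not picked up.
helper_files = dict()
helper_directory = Path("src/dbt")
for filename in os.listdir(helper_directory):
    source_file_path = os.path.join(os.getcwd(), helper_directory, filename)
    dest_file_path = os.path.join(helper_directory, filename)
    helper_files[source_file_path] = dest_file_path

storage = Docker(
    # ... (other Docker storage arguments)
    files=helper_files,
    env_vars=envars,  # envars: a dict of environment variables defined elsewhere (not shown)
)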
3. You need to provide a profiles_dir to let dbt know where your profiles.yml file lives. Perhaps profiles.yml is located in the same directory as your flow files, but it's quite possible that it isn't.
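Since a dbt project usually has nested folders (models, macros and so on), the loop from point 2 could be made recursive. The following is only a sketch under assumptions: build_files_dict is a hypothetical helper name, "/dbt" is an arbitrary destination folder, and the files argument is assumed to map absolute source paths to destination paths in the image, as in the snippet above. The demo runs against a throwaway directory rather than a real dbt project.

```python
import os
import posixpath
import tempfile


def build_files_dict(source_dir, dest_dir):
    """Map every file under source_dir (recursively) to a path under dest_dir.

    Hypothetical helper: keys are absolute paths on the build machine,
    values are POSIX-style destination paths inside the image.
    """
    source_root = os.path.abspath(source_dir)
    files = {}
    for root, _dirs, filenames in os.walk(source_root):
        for filename in filenames:
            source_path = os.path.join(root, filename)
            relative_path = os.path.relpath(source_path, source_root)
            # Image paths use forward slashes regardless of the host OS.
            dest_path = posixpath.join(dest_dir, *relative_path.split(os.sep))
            files[source_path] = dest_path
    return files


# Demo on a temporary directory that mimics a tiny dbt project layout.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "models", "staging"))
for relative in ("dbt_project.yml", os.path.join("models", "staging", "orders.sql")):
    with open(os.path.join(demo_root, relative), "w") as handle:
        handle.write("-- placeholder\n")

helper_files = build_files_dict(demo_root, "/dbt")
for source, dest in sorted(helper_files.items()):
    print(source, "->", dest)
```

The resulting helper_files dict could then be passed as files=helper_files in the same way as above.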
💯 1
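On point 3, it may help to see what dbt expects to find in that directory. dbt reads a profiles.yml from ~/.dbt by default, or from the directory passed with --profiles-dir. The sketch below writes a minimal file by hand to show its shape (profile name, target, and one output block holding the connection settings). The helper name write_profiles_yml, the profile name "default" and the placeholder credentials are all assumptions for illustration, and how DbtShellTask itself uses the directory is not covered here.

```python
import json
import os
import tempfile


def write_profiles_yml(profiles_dir, profile_name, target, settings):
    """Write a minimal profiles.yml into profiles_dir and return its path.

    Hypothetical helper, for illustration only.
    """
    lines = [
        f"{profile_name}:",
        f"  target: {target}",
        "  outputs:",
        f"    {target}:",
    ]
    for key, value in settings.items():
        # JSON scalars are also valid YAML scalars, so json.dumps handles quoting.
        lines.append(f"      {key}: {json.dumps(value)}")
    os.makedirs(profiles_dir, exist_ok=True)
    path = os.path.join(profiles_dir, "profiles.yml")
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return path


profiles_dir = tempfile.mkdtemp()
profiles_path = write_profiles_yml(
    profiles_dir,
    profile_name="default",  # assumed profile name
    target="Development",
    settings={
        "type": "snowflake",
        "threads": 1,
        "account": "<account>",  # placeholders, not real credentials
        "user": "<user>",
        "password": "<password>",
    },
)

with open(profiles_path) as handle:
    print(handle.read())
```

With a file like this in place, dbt could be pointed at it from the command line with dbt run --profiles-dir followed by that directory.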

Robin

07/28/2020, 11:58 AM
OK, got it! I am still having trouble with the profiles_dir parameter, basically with similar questions as mentioned in this PR.

Mark McDonald

07/29/2020, 4:25 PM
@Robin sorry for the delay in my response. I just saw this message on the thread. I will try to add that default in or make that parameter required. I should be able to get to it tonight. I have to think it over a little bit more. Is the task working for you otherwise?

Robin

07/29/2020, 4:30 PM
Thanks a lot!