Hey folks I ve a question for all the dbt data model masterm Prefect Community #prefect-dbt

Hey folks, I've a question for all the dbt data mo...

Andreas Nigg

10/31/2022, 5:03 PM

Hey folks, I've a question for all the dbt data model masterminds out there (I hope this is the right channel). I’ve a rather huge source table which is partitioned by a column “loaded_at”. I have an incremental model which reads from this source table. To exclude source table partitions to read, I could make use of the _dbt_max_partitions scripting variable - something like below

Copy code

{% if is_incremental() %}
    where loaded_at>= coalesce(_dbt_max_partition, '2022-01-01')
  {% else %}

But, the problem is, that in my incremental model I do not partition by “loaded_at” but by a different column (due to use-case demands). So _dbt_max_partition would not help here, as it would simply return the maximum partition value of the model (which I can’t use as filter for the source table). In “native” BigQuery I would simply use a scripting variable as follows

Copy code

declare max_source_partition timestamp;
set max_source_partition = (select max(loaded_at) as ts from `my_model_table`);
select * from `my_source_table` where loaded_at > max_source_partition

How can one implement such a scenario with dbt? Is there a way to create scripting variables as part of my models? Or do I need to add it as a on-start-hook? Or any better strategies to exclude partitions in my source without having the same column as partition field in my model?

Nate

11/02/2022, 9:19 PM

Hey @Andreas Nigg 👋 You might have more luck with this question in the dbt community slack given that this a pretty in-depth dbt-specific question When you're ready to orchestrate that dbt with prefect, this channel is a great place to ask questions about that!

👍 1

gratitude thank you 1

5 Views

Open in Slack

Previous Next