https://prefect.io logo
#data-tricks-and-tips
Title
# data-tricks-and-tips
a

Andreas Nigg

10/28/2022, 1:20 PM
Hey guys, I've a question for all the dbt data model masterminds out there (I hope this is the right channel). I’ve a rather huge source table which is partitioned by a column “loaded_at”. I have an incremental model which reads from this source table. To exclude source table partitions to read, I could make use of the _dbt_max_partitions scripting variable - something like below
Copy code
{% if is_incremental() %}
    where loaded_at>= coalesce(_dbt_max_partition, '2022-01-01')
  {% else %}
But, the problem is, that in my incremental model I do not partition by “loaded_at” but by a different column (due to use-case demands). So _dbt_max_partition would not help here, as it would simply return the maximum partition value of the model (which I can’t use as filter for the source table). In “native” BigQuery I would simply use a scripting variable as follows
Copy code
declare max_source_partition timestamp;
set max_source_partition = (select max(loaded_at) as ts from `my_model_table`);
select * from `my_source_table` where loaded_at > max_source_partition
How can one implement such a scenario with dbt? Is there a way to create scripting variables as part of my models? Or do I need to add it as a on-start-hook? Or any better strategies to exclude partitions in my source without having the same column as partition field in my model?
k

Kalise Richmond

10/28/2022, 9:40 PM
Hey @Andreas Nigg, you should ask in our #prefect-dbt channel
blob attention gif 1
3 Views