Gustavo Puma
04/08/2022, 5:50 PM
with Flow("application-etl") as flow:
    conn = PrefectSecret("DATABRICKS_CONNECTION_STRING_PRE")
    bronze = DatabricksRunNow(job_id=BRONZE_APPLICATION_JOB_ID, name="bronze")
    silver = DatabricksRunNow(job_id=SILVER_APPLICATION_JOB_ID, name="silver")
    silver.set_upstream(bronze)
    gold = DatabricksRunNow(job_id=GOLD_APPLICATION_JOB_ID, name="gold")
    gold.set_upstream(silver)
    bronze(databricks_conn_secret=conn)
    silver(databricks_conn_secret=conn)
    gold(databricks_conn_secret=conn)
flow.run()
I want to run my tasks in the sequence bronze > silver > gold. When this executes, though, gold or silver get started before bronze. I don't know if I'm misunderstanding how set_upstream works 🤔 Thanks in advance
Kevin Kho
04/08/2022, 5:53 PM
DatabricksRunNow() is the init, but you want to set the upstream on the result of the run call, i.e. the task you get back when you call it inside the flow
with Flow("application-etl") as flow:
    conn = PrefectSecret("DATABRICKS_CONNECTION_STRING_PRE")
    bronze = DatabricksRunNow(job_id=BRONZE_APPLICATION_JOB_ID, name="bronze")
    silver = DatabricksRunNow(job_id=SILVER_APPLICATION_JOB_ID, name="silver")
    gold = DatabricksRunNow(job_id=GOLD_APPLICATION_JOB_ID, name="gold")
    a = bronze(databricks_conn_secret=conn)
    b = silver(databricks_conn_secret=conn)
    c = gold(databricks_conn_secret=conn)
    b.set_upstream(a)
    c.set_upstream(b)
flow.run()
and the init doesn’t have to be in the flow
bronze = DatabricksRunNow(job_id=BRONZE_APPLICATION_JOB_ID, name="bronze")
silver = DatabricksRunNow(job_id=SILVER_APPLICATION_JOB_ID, name="silver")
gold = DatabricksRunNow(job_id=GOLD_APPLICATION_JOB_ID, name="gold")
with Flow("application-etl") as flow:
    conn = PrefectSecret("DATABRICKS_CONNECTION_STRING_PRE")
    a = bronze(databricks_conn_secret=conn)
    b = silver(databricks_conn_secret=conn)
    c = gold(databricks_conn_secret=conn)
    b.set_upstream(a)
    c.set_upstream(b)
flow.run()
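To make the "init vs. run" point a bit more concrete, here is a rough toy model of why the first version loses its ordering. It assumes, as the reply suggests, that calling a task inside the flow hands back a separate task object from the one you initialized. ToyFlow, ToyTask and edges_between are hypothetical names invented for this sketch; none of this is Prefect's actual implementation.

```python
# Toy model only: ToyFlow and ToyTask are hypothetical stand-ins,
# NOT Prefect classes. They only mimic "calling a task returns a copy".
import copy


class ToyFlow:
    """Records tasks and (upstream, downstream) edges."""

    def __init__(self):
        self.tasks = []
        self.edges = []

    def add_task(self, task):
        # Default equality is identity, so a copy counts as a new task.
        if task not in self.tasks:
            self.tasks.append(task)

    def add_edge(self, upstream, downstream):
        self.add_task(upstream)
        self.add_task(downstream)
        self.edges.append((upstream, downstream))


class ToyTask:
    def __init__(self, name):
        self.name = name

    def __call__(self, flow):
        # Calling the task registers a *copy* with the flow and returns it.
        bound = copy.copy(self)
        flow.add_task(bound)
        return bound

    def set_upstream(self, other, flow):
        flow.add_edge(other, self)


def edges_between(flow, tasks):
    """Names of edges whose two ends are both in `tasks` (by identity)."""
    ids = {id(t) for t in tasks}
    return [
        (up.name, down.name)
        for up, down in flow.edges
        if id(up) in ids and id(down) in ids
    ]


bronze, silver, gold = ToyTask("bronze"), ToyTask("silver"), ToyTask("gold")

# Pattern from the question: dependencies set on the initialized tasks.
flow1 = ToyFlow()
silver.set_upstream(bronze, flow1)
gold.set_upstream(silver, flow1)
a1, b1, c1 = bronze(flow1), silver(flow1), gold(flow1)
question_edges = edges_between(flow1, [a1, b1, c1])

# Pattern from the answer: dependencies set on what the calls return.
flow2 = ToyFlow()
a2, b2, c2 = bronze(flow2), silver(flow2), gold(flow2)
b2.set_upstream(a2, flow2)
c2.set_upstream(b2, flow2)
answer_edges = edges_between(flow2, [a2, b2, c2])

print(question_edges)  # []
print(answer_edges)    # [('bronze', 'silver'), ('silver', 'gold')]
```

In this toy model the question's pattern ends up with six tasks in the flow and no edge among the three that were actually called, which would match silver or gold starting before bronze. As far as I know, Prefect 1.x also accepts an upstream_tasks argument on the call itself, e.g. silver(databricks_conn_secret=conn, upstream_tasks=[a]), which should give the same ordering, but check that against the docs for your version.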
Gustavo Puma
04/11/2022, 10:46 AM