Scott Moreland 11/23/2020, 12:58 PM
Are there any references for this?
Kyle Moon-Wright 11/23/2020, 4:47 PM
Scott Moreland 11/23/2020, 5:01 PM
Kyle Moon-Wright 11/23/2020, 6:12 PM
Scott Moreland 11/23/2020, 6:22 PM
Kyle Moon-Wright 11/23/2020, 6:52 PM
as both being part of the write side of the Result, not the read (which only checks for existence). You may need to customize a task or some logic to check for your table's existence before doing your Result write. I can't think of a way to do this cleanly otherwise, but I will continue to think about it.
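The "check before write" idea above can be sketched in plain Python. This is only an illustration of the pattern, using a dict as a stand-in for the Hive metastore; all names here (`table_exists`, `write_if_absent`) are hypothetical, and a real version would query Spark's catalog instead.

```python
# Sketch of "check for the table before doing the Result write".
# A plain dict stands in for the Hive metastore here.

def table_exists(metastore, name):
    # Stand-in for a real catalog lookup (e.g. a SHOW TABLES query).
    return name in metastore

def write_if_absent(metastore, name, data):
    # Only perform the write when the target table is missing;
    # return whether a write actually happened.
    if table_exists(metastore, name):
        return False  # target already exists: skip the rebuild
    metastore[name] = data
    return True
```

Calling `write_if_absent` twice with the same name writes once and skips the second time, which is the behavior being discussed for targets.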
Scott Moreland 11/23/2020, 7:06 PM
Dealing with something like this and trying to avoid duplicating the persistent table name in both location and target. Also wondering if I really need to manually specify

sql_context = create_sql_context()
db_result = HiveResult(sql_context, location='task_output_table_name')

@task(target='task_output_table_name', result=db_result)
def create_table(sql_context):
    """Transform a table"""
    sdf = sql_context.read.table('database.src_table_name')
    sdf = sdf.groupby('col1').agg(sf.sum('col2').alias('sum'))
    db_result.write(sdf)
    return sdf

as it sounds like this should happen automatically when the target doesn't exist and needs to be rebuilt.
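One way to avoid repeating the table name in both `location` and `target` is simply to declare it once and pass the same value to both. This is a minimal sketch of that idea, not Prefect API; `TABLE_NAME` and `make_task_kwargs` are illustrative names.

```python
# Declare the persistent table name once, then derive both the Result
# location and the task target from it so they can never drift apart.

TABLE_NAME = "task_output_table_name"

def make_task_kwargs(table_name):
    # Build the values that would be shared by
    # HiveResult(location=...) and @task(target=...).
    return {"location": table_name, "target": table_name}
```

With this, changing the table name in one place updates both settings.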
Kyle Moon-Wright 11/23/2020, 7:18 PM
method on your
Scott Moreland 11/23/2020, 7:20 PM
Kyle Moon-Wright 11/23/2020, 7:28 PM
task runs and sees that the target exists?
Scott Moreland 11/23/2020, 9:35 PM
to True via the associated environment variable. Thanks again for the help!
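The thread does not name the environment variable, so this is my guess: if it refers to Prefect 0.x's checkpointing setting (which gates whether Results are written at all), it is typically toggled like this before the flow runs.

```python
import os

# Assumption: the variable in question is Prefect 0.x's checkpointing
# flag. Setting it to "true" enables Result persistence for flow runs.
os.environ["PREFECT__FLOWS__CHECKPOINTING"] = "true"
```

The same value can also be exported in the shell or set in the Prefect config file rather than from Python.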
Kyle Moon-Wright 11/23/2020, 9:40 PM