Edison A
10/30/2020, 5:34 PM
with Flow("epex_scraping", schedule=schedule) as flow:
    """Main definition of all Data pipeline steps"""
    report_names = scrape_for_file_names()
    for report_name in report_names:
        # extract
        report_xml = get_xml_files(report_name)
        report_json = get_xml_jsons(report_xml)
        # transform
        public_trades_collection = generate_public_trades(report_json)
        # load
        write_to_public_trades_db(public_trades_collection)
flow.register('project_x')
flow.run()
Dylan
`report_names` is not a list, it’s a `Task`’s result. The `list` becomes available at runtime.
Edison A
10/31/2020, 8:43 AM