Dominic Pham
10/26/2021, 6:57 PM
import prefect
from prefect import task
from prefect.engine.signals import LOOP

@task
def query_that_will_return_a_list() -> list:
    ...  # body omitted in the original message

@task
def scrapy_api_call_chunks(title_list):
    loop_payload = prefect.context.get("task_loop_count", 0)
    title_list_grouper = list(grouper(title_list, 10))
    if loop_payload <= len(title_list_grouper):
        # Each loop is an iteration over 10 titles, so (number of loops * 10) is the total number of titles looped over so far
        raise LOOP(message='Running the next 10 items in job titles list')
    scraper_class = Scraper()
    scraper_class.instantiate_web_scraper(title_list_grouper[loop_payload - 2])
I feel like I don't fully understand how to utilize LOOP in the context of passing information to another function inside the task.
Kevin Kho
grouper code?
Kevin Kho
When you raise LOOP, you pass the modified data to the next loop as the input. Have you seen the example here?
Dominic Pham
10/26/2021, 7:11 PM
grouper is a more-itertools recipe.
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
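For reference, a quick check of how this recipe behaves when the list length is not a multiple of the chunk size. The sample titles below are made up for illustration. The final chunk gets padded with the fillvalue (None by default), so the padding probably needs to be stripped before a chunk is handed to something like a scraper.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Hypothetical sample data: 5 titles split into chunks of 2.
titles = ["analyst", "engineer", "designer", "manager", "scientist"]
chunks = list(grouper(titles, 2))

# Drop the None padding that zip_longest adds to the final chunk.
clean_chunks = [[t for t in chunk if t is not None] for chunk in chunks]
```

Each chunk comes back as a tuple, and only the last one can contain padding.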
Kevin Kho
raise LOOP(message=f"Fib {n}={next_fib}", result=dict(n=n + 1, fib=next_fib))
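To make the "result becomes the next loop's input" idea concrete, here is a plain-Python stand-in for the pattern. It is not Prefect's engine: the Loop class, process_chunks, and run_with_loop are hypothetical names used only for this sketch. In Prefect 1.x the engine does the re-running itself, and the value passed as result should be readable on the next iteration via prefect.context.get("task_loop_result").

```python
class Loop(Exception):
    """Stand-in for a loop signal: carries the state for the next iteration."""

    def __init__(self, message, result):
        super().__init__(message)
        self.result = result


def process_chunks(chunks, loop_result=None):
    # First call: no carried state yet, so start at chunk 0.
    state = loop_result or {"index": 0, "done": []}
    index, done = state["index"], state["done"]

    # Do the work for the current chunk first (here: just record it).
    done = done + [list(chunks[index])]

    # If chunks remain, signal another iteration and hand over the new state.
    if index + 1 < len(chunks):
        raise Loop(f"Finished chunk {index}", result={"index": index + 1, "done": done})
    return done


def run_with_loop(fn, *args):
    # Tiny driver: re-invoke fn with the carried result until it returns.
    loop_result = None
    while True:
        try:
            return fn(*args, loop_result=loop_result)
        except Loop as signal:
            loop_result = signal.result


processed = run_with_loop(process_chunks, [("a", "b"), ("c", "d"), ("e",)])
```

The work for the current chunk happens before the signal is raised, and the index of the next chunk travels in the result rather than being derived from a loop counter. The sketch assumes at least one chunk is present.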
Dominic Pham
10/26/2021, 7:53 PM
Kevin Kho
Dominic Pham
10/26/2021, 8:20 PM
Kevin Kho
logger = prefect.context.get("logger")
logger.info(your_stuff_here)
inside the task