# ask-community
d
Hi all! I have a task that calls a SQL query and returns a list, and I want to iterate over that list and pass even chunks of the list to a scrapy spider that is called in another task and wait between each chunk. However, I'm running into an issue with LOOP where it is only passing the last chunk of the list to the scrapy spider.
@task
def query_that_will_return_a_list() -> list:
    ...

@task
def scrapy_api_call_chunks(title_list):
    loop_payload = prefect.context.get("task_loop_count", 0)
    title_list_grouper = list(grouper(title_list, 10))
    if loop_payload <= len(title_list_grouper):
        # Each loop will be an iteration of 10 titles. # of loops * 10 will result in the total number of titles looped over so far
        raise LOOP(message = 'Running the next 10 items in job titles list')
    scraper_class = Scraper()
    scraper_class.instantiate_web_scraper(title_list_grouper[loop_payload - 2])
I feel like I don't fully understand how to utilize LOOP in the context of passing information to another function inside the task.
k
Can I see your `grouper` code?
So the way the loop works, when you `raise LOOP`, you pass the modified data to the next loop as the input. Have you seen the example here?
d
Hi Kevin, `grouper` is a more-itertools recipe.
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
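One detail of this recipe is easy to miss when the list length is not a multiple of the chunk size: the last chunk is padded with the fill value. A minimal sketch of that behaviour, using a made-up list of 25 placeholder titles (the data and the names `titles`, `clean_chunks`, and `chunk_sizes` are illustrative assumptions, not from the thread):

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect data into non-overlapping fixed-length chunks (itertools recipe)."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Hypothetical data: 25 placeholder titles, not from the original thread.
titles = [f"title-{i}" for i in range(25)]

# Materialize the chunks so they can be indexed and counted.
chunks = list(grouper(titles, 10))

# The last chunk is padded with the fill value (None) up to length 10.
padded_last = chunks[-1]

# Drop the padding before handing a chunk to downstream code.
# This assumes None never appears as a real title.
clean_chunks = [[t for t in chunk if t is not None] for chunk in chunks]
chunk_sizes = [len(c) for c in clean_chunks]
```

If the scraper cannot handle `None` entries, filtering the padding like this (or passing a sentinel as `fillvalue` and filtering on that) is probably the safer option.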
k
Ah ok. I think the issue is indeed just understanding the LOOP more. You need to pass the result to the next iteration, like the example:
raise LOOP(message=f"Fib {n}={next_fib}", result=dict(n=n + 1, fib=next_fib))
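To make the pattern concrete for the chunking case: in the snippet from the question, `raise LOOP` fires before the scraper call, so the scraper line is only reached on the final iteration, which would explain why only one chunk is ever scraped. The sketch below simulates the idea without Prefect so it can be run anywhere. `LoopSignal`, `run_with_loop`, `scrape_in_chunks`, and `fake_scrape` are hypothetical stand-ins, not Prefect APIs; in a Prefect 1.x task the payload would be read with `prefect.context.get("task_loop_result", {})` and the signal would be `prefect.engine.signals.LOOP`.

```python
class LoopSignal(Exception):
    """Stand-in for Prefect 1.x's LOOP signal (hypothetical, for illustration)."""

    def __init__(self, message, result):
        super().__init__(message)
        self.result = result


def run_with_loop(task_fn, *args):
    """Re-run task_fn until it returns, feeding each raised result into the next run."""
    payload = {}
    while True:
        try:
            return task_fn(payload, *args)
        except LoopSignal as signal:
            payload = signal.result


def scrape_in_chunks(payload, chunks, scrape):
    # In Prefect 1.x the payload would come from
    # prefect.context.get("task_loop_result", {}) instead of an argument.
    index = payload.get("index", 0)
    done = payload.get("done", [])

    if index >= len(chunks):
        return done  # a plain return ends the loop

    # Do the work for the current chunk first ...
    done = done + [scrape(chunks[index])]

    # ... then request another iteration, carrying the state forward.
    raise LoopSignal(
        message=f"Finished chunk {index}",
        result={"index": index + 1, "done": done},
    )


seen = []


def fake_scrape(chunk):
    # Placeholder for the real scraper call; records what it was given.
    seen.append(list(chunk))
    return len(chunk)


results = run_with_loop(scrape_in_chunks, [["a", "b"], ["c", "d"], ["e"]], fake_scrape)
```

The key ordering is "work, then raise": each iteration handles exactly one chunk and passes the next index along in `result`. A pause between chunks (for example `time.sleep`) could go just before the raise.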
d
Ah okay, I think I understand. Is there a way to see what is being passed between each loop?
k
You log it inside the task so it gets printed in the logs every loop
d
Apologies, is there an example of what that might look like involving a LOOP? I've looked over the documentation on Logging but I can't quite wrap my head around it
k
No worries. It would be:
logger = prefect.context.get("logger")
logger.info(your_stuff_here)
inside the task