# prefect-community
j
Hi there everyone, I’m trying to understand how to do simple map tasks with prefect2
from prefect import flow, task

@task
def generate_numbers():
    return [1, 2, 3, 4]

@task 
def compute_sth_expensive(number):
    return number ** 2

@flow
def pipeline():
    result_generate_numbers = generate_numbers()
    results = map(compute_sth_expensive, result_generate_numbers)
    for r in results: r.result() ## ??
Is that an acceptable pattern? I want to do a parallel calculation over result_generate_numbers and then perform some gather-like operation.
a
Hi Jan! This page provides an example
Mapping is on the roadmap - for now, you can solve this by using a for-loop and attaching a Dask, Ray or concurrent task runner to your flow
j
Thank you for your answer! But doesn’t this suffer from the performance problem described here? https://github.com/PrefectHQ/prefect/issues/5653
a
This issue is only there to investigate potential extra performance improvements; the logic itself and parallel execution are working fine. Can you say more about the problem you are trying to solve? For many IO-based use cases, such as talking to external APIs, DBs and processing files, the default concurrent task runner + a for-loop may be all you need to run things fast enough without the overhead of Dask or Ray
j
Perfect, this is super helpful, thank you
Yeah, it’s about doing an inexpensive map calculation on a huge list of objects (1e8) but we’re seeing the problem with even 1e2-1e3 objects, so we’ll consider chunking to improve the situation. The looping indeed works for us.
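A minimal sketch of the chunking idea mentioned above, in plain Python: instead of one task run per object, each unit of work processes a whole slice, so the per-task overhead is paid once per chunk. The names `chunked` and `square_chunk` and the chunk size are hypothetical; in a Prefect flow, `square_chunk` would presumably be the function decorated with `@task`.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def square_chunk(chunk: Sequence[int]) -> List[int]:
    # One call handles a whole chunk rather than a single number.
    return [n ** 2 for n in chunk]


numbers = list(range(1, 11))
chunks = list(chunked(numbers, 4))

# Flatten the per-chunk results back into a single list, preserving order.
results = [square for chunk in chunks for square in square_chunk(chunk)]
```

A suitable chunk size depends on how expensive each item is relative to the scheduling overhead, so it is worth measuring rather than guessing.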
Are the map and for-loop approaches equivalent? They should be, right?
What’s the recommended way to aggregate multiple results in a subsequent task? wait_for?
a
Roughly yes. Mapping has some nuances, but it should have the same effect. And yes, to retrieve the results, you'd need to call .result()
j
Okay, thank you