marrrcin
08/27/2024, 5:47 AM
flows are dynamic by default => no DAGs => no way to parallelize automatically, so everything needs to be explicitly parallelized.
Even if my flow is embarrassingly parallel, nothing gets parallelized automatically.
The only way to achieve that is to use <task>.map, right? https://docs-3.prefect.io/3.0rc/develop/task-runners#mapping-over-iterables
Or am I missing something?
Marvin
08/27/2024, 5:48 AM
task.map is a way to parallelize a task across the elements of an iterable. Example:
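from prefect import flow, task

@task
def process_data(x):
    return x * 2

@flow
def my_flow():
    # Submits one process_data run per element; the result holds futures, not values
    result = process_data.map([1, 2, 3])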
(source: https://docs.prefect.io/latest/faq/)
2. Concurrency and Parallelism: Prefect supports creating concurrency limits and also has task runners specifically designed for parallelism, such as the ConcurrentTaskRunner, DaskTaskRunner, and RayTaskRunner.
(source: https://docs.prefect.io/latest/concepts/task-runners/)
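As a rough analogy only, the explicit fan-out discussed in this thread can be sketched with the Python standard library: one call is submitted per input, and the results are resolved afterwards in input order. This is not Prefect's implementation; `fan_out` and its `max_workers` default are hypothetical names chosen for illustration, and a thread pool stands in for a thread-based task runner.

```python
from concurrent.futures import ThreadPoolExecutor


def process_data(x):
    """Same toy computation as the mapped task in the example above."""
    return x * 2


def fan_out(fn, items, max_workers=4):
    """Submit one call per item, then resolve the results in input order.

    Hypothetical helper: it mimics the shape of a mapped task, nothing more.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Nothing runs concurrently unless it is explicitly submitted.
        futures = [pool.submit(fn, item) for item in items]
        # Resolving each future blocks until that call has finished.
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(fan_out(process_data, [1, 2, 3]))
```

The point of the sketch is that a plain loop of calls would run one after another; concurrency only appears once the calls are handed to an executor, which matches the "explicit, not automatic" behavior raised in the question.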
These features must be configured explicitly to achieve the desired parallel execution.
Marvin
08/27/2024, 5:49 AM
task.map is a common way to parallelize tasks explicitly. This allows you to run the same task across multiple inputs in parallel.
3. Task Runners: Additionally, using specific task runners like ConcurrentTaskRunner, DaskTaskRunner, or RayTaskRunner can help in achieving parallel execution of tasks.
Here's an example using `task.map`:
@flow
def my_flow():
    # task_a is assumed to be a function decorated with @task
    task_a.map([1, 2, 3])
For more information, you can refer to the Prefect Task Runners documentation and the Mapping over Iterables section.
Marvin
08/27/2024, 5:49 AM
task.map.
2. Example of task.map:
@flow
def my_flow():
    task_a.map([1, 2, 3])
3. Concurrency Limits: Concurrency can be managed using concurrency context managers if needed.
4. Documentation on Mapping: More details can be found here.
5. Concurrent Tasks Example: More advanced examples involving concurrency can be found here.
For a detailed code example related to mapping tasks, see this code example.
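The concurrency limits mentioned in item 3 are not shown in the thread. As a standard-library analogy only (this is not Prefect's concurrency API), the idea of capping how many mapped calls run at once can be sketched with a semaphore. `run_limited`, `slow_double`, `limit`, and `max_workers` are hypothetical names introduced for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def slow_double(x):
    """Toy workload: sleep briefly so that calls overlap in time."""
    time.sleep(0.01)
    return x * 2


def run_limited(fn, items, limit=2, max_workers=8):
    """Run fn over items in a thread pool, allowing at most `limit` calls at once.

    Returns the results in input order and the peak number of calls
    that were observed running simultaneously.
    """
    slots = threading.BoundedSemaphore(limit)  # the concurrency limit
    lock = threading.Lock()                    # protects the two counters
    active = 0
    peak = 0

    def guarded(item):
        nonlocal active, peak
        with slots:  # blocks while `limit` calls are already running
            with lock:
                active += 1
                peak = max(peak, active)
            try:
                return fn(item)
            finally:
                with lock:
                    active -= 1

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(guarded, items))
    return results, peak


if __name__ == "__main__":
    results, peak = run_limited(slow_double, range(6), limit=2)
    print(results, peak)
```

Even though the pool has eight workers, the semaphore keeps the observed peak at or below `limit`, which is the same trade-off a concurrency limit makes: work is still fanned out explicitly, but a shared resource is protected from too many simultaneous callers.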