Thread
#prefect-community

    Matthias

    2 years ago
    Hi Prefect Crew 😃 I have a task that returns a list. Using `map`, I pass the list to another task, which returns a list for each item. I want to `map()` this nested list to another task to iterate over all the items. Is there an intended way of doing so? Thanks!
    I've now written a function that flattens the list of lists into a single list. That's probably the best way for now.
    With a Dask worker, is it a good idea to map over thousands of list items, or should I loop within a single task? Each iteration would perform a web request.
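
The flattening helper mentioned above was not posted in the thread. A minimal sketch of what such a function might look like, using only the standard library (the name `flatten` and the sample data are assumptions for illustration):

```python
from itertools import chain


def flatten(nested):
    """Collapse a list of lists into one flat list (one level deep only)."""
    return list(chain.from_iterable(nested))


# Hypothetical output of a mapped task: one list per input item.
nested_results = [["a", "b"], [], ["c"]]
flat = flatten(nested_results)
```

Empty inner lists simply contribute nothing to the result. Wrapped as a task, a function like this would serve as the reduce step between the two maps.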

    Jeremiah

    2 years ago
    Hey @Matthias - we’re looking at implementing a flat_map operator that will help with nested maps without needing an intermediate reduce step, but for now you will need to collect the results of your map and re-map over the combined list (because Prefect maps are currently only defined as parallel pipelines).
    As for whether you should map or loop - it’s a subjective question along two dimensions: 1) resource availability and 2) unit of work preference. Mapping will give you greater visibility (and performance) than looping, because each item becomes its own task. However, it comes with proportionately greater resource requirements for passing the data around and executing the operation in parallel. In general, thousands of items is no issue for a sufficiently large Dask cluster, so I’d suggest mapping up until you hit a memory constraint.
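
The collect-and-re-map pattern described above can be sketched in plain Python. This is not Prefect code: a standard-library thread pool stands in for the mapped tasks, and `expand`, `process`, and `run_pipeline` are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import chain


def expand(item):
    # Hypothetical first mapped step: each input item yields a list.
    return [f"{item}-{i}" for i in range(2)]


def process(item):
    # Hypothetical second mapped step; a stand-in for the web request.
    return item.upper()


def run_pipeline(items, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # First map: one call per input item, producing a list of lists.
        nested = list(pool.map(expand, items))
        # Intermediate reduce: collect and flatten before mapping again.
        flat = list(chain.from_iterable(nested))
        # Second map: one call per flattened item.
        return list(pool.map(process, flat))


results = run_pipeline(["a", "b"])
```

The flatten step is a synchronization point: every call in the first map has to finish before the second map starts, which is the intermediate reduce that a flat_map operator would make unnecessary.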