Thread
#prefect-community
    Tushar Kaithakkulam
    1 year ago
    Hi, I need some help or guidance. Right now I have a flow with a task called "process_document_page" that uses map to process the list of pages in one document. Now I want to scale up and process a list of documents with the same task, which would require a map of a map, and I understand Prefect doesn't support that yet. Is there a way to create a flow that handles this list of documents and calls "process_document_page" for each document? I have something like the below; is there a workaround for it?
    @task(name="process_document")
    def process_document(document):
        processed_document = process_document_pages.map(document)
        return processed_document

    with Flow(name="Local Test") as flow:
        documents = process_document.map(documents)
    Kevin Kho
    1 year ago
    Hi @Tushar Kaithakkulam! Will respond more tomorrow, but what do you want your return type to be? A list of lists, or one big list?
    Tushar Kaithakkulam
    1 year ago
    Thanks for the reply, Kevin. We would like to have a list of lists.
    Kevin Kho
    1 year ago
    Hey, I don't think you can preserve the list of lists. You can use flatten and then map on the flattened list, and that will parallelize the work across the document pages. You would then need some other task to bring the results back into a list-of-lists format. Conceptually, a map of a map doesn't provide more parallelism, because the first map is already parallelizing across the available resources.
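For illustration, here is a minimal plain-Python sketch of the flatten, map, regroup idea described above. It deliberately does not use Prefect, so it only shows the bookkeeping. In a real flow each function would presumably be wrapped as a task, and the list comprehension would be the mapped call over the flattened pages. The body of process_document_page, the helper names flatten_documents and regroup, and the sample data are all hypothetical.

```python
from itertools import islice


def process_document_page(page):
    # Placeholder for the real per-page work (hypothetical body).
    return page.upper()


def flatten_documents(documents):
    """Return the flat list of pages plus the page count of each document."""
    flat_pages = [page for doc in documents for page in doc]
    page_counts = [len(doc) for doc in documents]
    return flat_pages, page_counts


def regroup(flat_results, page_counts):
    """Rebuild the list of lists from flat results and per-document counts."""
    results = iter(flat_results)
    return [list(islice(results, count)) for count in page_counts]


# Sample input: three documents with 2, 1 and 3 pages (made-up data).
documents = [["a1", "a2"], ["b1"], ["c1", "c2", "c3"]]

flat_pages, page_counts = flatten_documents(documents)

# In a flow, this single flat map is the step that would run in parallel.
flat_results = [process_document_page(page) for page in flat_pages]

processed_documents = regroup(flat_results, page_counts)
print(processed_documents)
```

The regrouping relies on the mapped results coming back in the same order as the flattened input, so the per-document page counts are enough to split them again.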