Thread
#prefect-community
    Adam

    2 months ago
    Hey folks, Prefect noob here with a hopefully simple question. I'm trying to test the caching/restart features. For simplicity, I have 5 CSVs that I'm downloading within 1 task. If the first CSV succeeds but the task then fails on the 2nd CSV, I'd like to restart the task where it left off without re-downloading the 1st CSV. From my understanding, I need to leverage the results feature? I understand there's a parameter for max retries, but when retrying, is any configuration needed to tell the task to start from the 2nd CSV? Also, if the max retries are exhausted, I'd like to be able to restart manually (presumably after fixing whatever bug stopped the 2nd CSV from downloading), again without re-downloading the 1st CSV.
    Kyle McChesney

    2 months ago
    I'm not sure you'll be able to tell Prefect that part of a task succeeded and part of it failed. You 100% need results to get the retry functionality to work. You should probably also look at map, which lets you have 1 task per CSV and capture the fact that some worked and some failed. Prefect should be able to recover and re-run just the failed mapped task instances.
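A minimal plain-Python sketch of the idea behind running one unit of work per CSV: each item succeeds or fails on its own, so a second pass only needs to touch the failures. This is not Prefect's API; `run_per_item` and `fake_download` are hypothetical names made up for illustration, and the "download" is a stand-in that fails for one file.

```python
from typing import Callable, Dict, Iterable, Tuple


def run_per_item(
    items: Iterable[str],
    work: Callable[[str], str],
) -> Tuple[Dict[str, str], Dict[str, Exception]]:
    """Run `work` once per item and record successes and failures separately."""
    succeeded: Dict[str, str] = {}
    failed: Dict[str, Exception] = {}
    for item in items:
        try:
            succeeded[item] = work(item)
        except Exception as exc:  # one bad item must not abort the others
            failed[item] = exc
    return succeeded, failed


# Hypothetical stand-in for a real download: it fails for one file name.
def fake_download(name: str) -> str:
    if name == "b.csv":
        raise IOError(f"could not fetch {name}")
    return f"contents of {name}"


files = ["a.csv", "b.csv", "c.csv"]
done, errors = run_per_item(files, fake_download)

# Second pass (after "fixing the bug"): only the failed items are attempted again.
retry_done, retry_errors = run_per_item(list(errors), lambda name: f"contents of {name}")
done.update(retry_done)
```

With a single task that loops over all five files, the orchestrator only sees one success or one failure, which is why splitting the work per file matters.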
    Andrew Huang

    2 months ago
    I think if you're downloading 5 CSVs within 1 task, you should split it into 5 separate task runs: for example, define a basic download_csv task and call it 5 times. Then the retries only operate on the failed downloads. Here's an example using Prefect 2.0:
    import pandas as pd
    from prefect import flow, task
    
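    # Each call is its own task run, retried up to 2 times, 5 seconds apart.
    @task(retries=2, retry_delay_seconds=5)
    def download_csv(url):
        return pd.read_csv(url)
    
    @flow
    def download_flow(urls):
        # One task run per URL, so only the failed downloads are retried.
        dfs = [download_csv(url) for url in urls]
        return dfs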
    Adam

    2 months ago
    got it, thanks both. Andrew, if I implement it this way, do I need to save the results if I want to manually restart a failed flow?
    Andrew Huang

    2 months ago
    You can cache the results, if that's what you mean by saving them: https://orion-docs.prefect.io/tutorials/flow-task-config/#cache-key-function
    you might be interested in this short series

    https://www.youtube.com/watch?v=hKcOdQD9eGQ&list=PLZfWmQS5hVzFmPh4hVj9ijtl-jRhsWD6E&index=3
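
To illustrate the caching idea mentioned above, here is a rough plain-Python sketch of what a cache key buys you: the key is derived from the task's input, and a stored result with the same key is reused instead of recomputed. It is not how Prefect implements caching. `cache_key`, `cached_fetch`, and `fake_fetch` are hypothetical names, the example URL is a placeholder, and storing results as files in a temporary directory is an assumption made to keep the sketch self-contained.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, List


def cache_key(url: str) -> str:
    """Derive a stable key from the task input (here, just the URL)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def cached_fetch(url: str, fetch: Callable[[str], str], cache_dir: Path) -> str:
    """Return the stored result for `url` if present, else fetch and store it."""
    path = cache_dir / f"{cache_key(url)}.csv"
    if path.exists():
        return path.read_text()
    text = fetch(url)
    path.write_text(text)
    return text


calls: List[str] = []


# Hypothetical stand-in for a real download; it records every call it receives.
def fake_fetch(url: str) -> str:
    calls.append(url)
    return "id,value\n1,10\n"


cache_dir = Path(tempfile.mkdtemp())
first = cached_fetch("https://example.com/a.csv", fake_fetch, cache_dir)
second = cached_fetch("https://example.com/a.csv", fake_fetch, cache_dir)
```

Because the stored result lives outside the process, a manual re-run after a failure would also skip any download that already completed, which is the behavior asked about at the top of the thread.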