Thread
#prefect-community
    Varuna Bamunusinghe

    8 months ago
    Is there a way to download a file from S3, save it to a location, and skip re-downloading it if it's already saved?
    Kevin Kho

    8 months ago
    Hi @Varuna Bamunusinghe, I think this should be handled in your task logic. How are you downloading the files now?
    Varuna Bamunusinghe

    8 months ago
    I am using the AWS CLI, but I could also download it with boto3. Either way, I can't find a way to skip the step if the file has already been downloaded. I can manually check with os.path.exists, but I'd prefer to use the task decorator's checkpoint if possible.
    I just checked the Task classes. I would be able to write a Task class for this.
    Kevin Kho

    8 months ago
    I don’t think checkpointing is intended for this use case. It’s meant to persist a task result so that it can be loaded when you need to restart a flow run from the point of failure. Checking for the file with os.path.exists would fit better in the task logic.
    Varuna Bamunusinghe

    8 months ago
    I just implemented an S3Downloader with os.path.exists. Thanks for the help.
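
For readers landing on this thread, here is a minimal sketch of the "check in the task logic" approach, not the S3Downloader actually written above (that code was not posted). The function name `download_if_missing`, its parameters, and the `.part` temporary suffix are hypothetical choices for illustration. It assumes a boto3 S3 client, whose `download_file(Bucket, Key, Filename)` method does the transfer, and it uses `os.path.exists` for the skip check.

```python
import os


def download_if_missing(bucket, key, dest_path, s3_client=None):
    """Download s3://bucket/key to dest_path unless dest_path already exists.

    Returns True if a download happened, False if it was skipped.
    All names here are illustrative, not part of any library.
    """
    # Skip the download entirely when the file is already on disk.
    if os.path.exists(dest_path):
        return False

    if s3_client is None:
        # Imported lazily so the existence check needs no AWS dependency.
        import boto3
        s3_client = boto3.client("s3")

    # Create the destination directory if it does not exist yet.
    parent = os.path.dirname(dest_path)
    if parent:
        os.makedirs(parent, exist_ok=True)

    # Download to a temporary name first, then rename, so an interrupted
    # download is not mistaken for a finished file on the next run.
    tmp_path = dest_path + ".part"
    s3_client.download_file(bucket, key, tmp_path)
    os.replace(tmp_path, dest_path)
    return True
```

The body of this function could be used as the logic of a task. Note that the check only looks at whether the path exists; it does not detect that the object in S3 has changed since the last download.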