Is there a way to download a file from S3, save it to a location, and skip re-downloading it if the file is already saved?
01/20/2022, 6:21 AM
Hi @Varuna Bamunusinghe, I think this should be handled in your task logic. How are you downloading the files now?
01/20/2022, 6:31 AM
I am using the AWS CLI, but I could download it using boto3 instead. What I can't find is a way to skip the step if the file has already been downloaded. I can manually check with os.path.exists, but I'd prefer to use the task decorator's checkpoint option if possible.
I just checked the Task classes. I would be able to write a Task class for this.
01/20/2022, 6:50 AM
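For reference, the manual existence check mentioned above could look roughly like the sketch below. It is only an illustration, not something posted in the thread: the function name `download_if_missing`, its parameters, and the `.part` temporary suffix are all made up for this example. The only boto3 call it relies on is the S3 client's `download_file(Bucket, Key, Filename)`.

```python
import os
from pathlib import Path


def download_if_missing(bucket, key, dest, client=None):
    """Download s3://bucket/key to dest unless dest already exists.

    Hypothetical helper: returns True if a download happened,
    False if it was skipped because the file was already there.
    """
    dest = Path(dest)
    if dest.exists():
        return False  # already saved locally, skip the download

    if client is None:
        import boto3  # imported lazily; only needed when we really download
        client = boto3.client("s3")

    dest.parent.mkdir(parents=True, exist_ok=True)
    # Download to a temporary name first so an interrupted transfer
    # is not mistaken for a complete file on the next run.
    tmp = dest.with_name(dest.name + ".part")
    client.download_file(bucket, key, str(tmp))
    os.replace(tmp, dest)
    return True
```

A function like this can be called from inside a task body, so the skip logic lives in the task rather than in checkpointing. Note that it only checks that the file exists, not that it matches the current object in S3.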
I don’t think checkpointing is intended for this use case. Its purpose is to persist a task result so that it can be loaded when you need to restart a flow run from the point of failure. This type of checking the file with