Is there a way to download a file from S3 and save...
# prefect-community
v
Is there a way to download a file from S3, save it to a location, and skip re-downloading the file if it's already saved?
k
Hi @Varuna Bamunusinghe, I think this should be handled in your task logic. How are you downloading the files now?
v
I am using the AWS CLI, but I could download it with boto3 instead. What I can't find is a way to skip the step if the file has already been downloaded. I can check manually with os.path.exists, but I would prefer to use the task decorator's checkpoint option if possible.
I just checked the Task classes. I would be able to write a Task class for this.
k
I don’t think checkpointing is intended for this use case. It persists a task result so that it can be loaded when you need to restart a flow run from the point of failure. Checking for the file with os.path.exists would fit better in the task logic.
v
I just implemented an S3Downloader with os.path.exists. Thanks for the help.
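For anyone landing on this thread later, here is a minimal sketch of the approach discussed above: check for the local file first and only call S3 when it is missing. The S3Downloader from the thread was not shared, so the function name `download_if_missing`, its parameters, and the `.part` temporary suffix are all assumptions made for illustration. The only boto3 call used is the S3 client's `download_file`.

```python
import os


def download_if_missing(bucket, key, local_path, client=None):
    """Download s3://bucket/key to local_path unless the file already exists.

    Hypothetical helper (not from the thread). Returns True if a download
    happened and False if it was skipped.
    """
    if os.path.exists(local_path):
        return False  # already on disk, so skip the download

    if client is None:
        import boto3  # imported lazily so the skip path needs no AWS setup
        client = boto3.client("s3")

    # Create the target directory if the path includes one.
    parent = os.path.dirname(local_path)
    if parent:
        os.makedirs(parent, exist_ok=True)

    # Download to a temporary name first so that an interrupted transfer
    # does not leave a partial file that later looks "already downloaded".
    tmp_path = local_path + ".part"  # suffix is an arbitrary choice
    client.download_file(bucket, key, tmp_path)
    os.replace(tmp_path, local_path)
    return True
```

A function like this could be called from inside a task body, which keeps the skip decision in the task logic as suggested above. Note that it only checks whether the file exists, not whether the object in S3 has changed since the last download.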