I’m using S3 storage for flows running with an ECS Agent. By default the flow’s tasks use S3Result as outlined here. Is there any way to disable task results, for individual tasks or for all tasks in the flow, while keeping the S3Storage on the flow?
Amanda Wee
04/09/2021, 10:00 AM
It sounds like you might want to disable checkpointing, e.g., by setting
Thanks Amanda! Reading through that page makes me realise I need to rethink my ideas a little here instead of actually disabling the checkpointing. For example, I might actually want to keep the S3Results and, instead of passing actual results between tasks, pass a pointer to them: for instance, upload a big file to S3 within a task and pass its key to downstream tasks.
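The pointer-passing idea described above can be sketched roughly as follows. This is a minimal illustration rather than Prefect or AWS code: `FAKE_BUCKET`, `upload_large_file`, and `count_bytes` are hypothetical names, and the dictionary is only a stand-in for an S3 bucket so the example runs without credentials. In a real flow the two functions would presumably be tasks, and the dictionary access would be replaced by S3 calls.

```python
import hashlib

# Hypothetical in-memory stand-in for an S3 bucket, so the sketch runs without AWS.
FAKE_BUCKET = {}


def upload_large_file(data: bytes) -> str:
    """Store the large payload and return only its key (the 'pointer')."""
    key = "uploads/" + hashlib.sha256(data).hexdigest()[:12]
    FAKE_BUCKET[key] = data
    return key


def count_bytes(key: str) -> int:
    """Downstream step: fetch the payload by key and return a small summary."""
    return len(FAKE_BUCKET[key])


payload = b"x" * 1_000_000
key = upload_large_file(payload)  # only this short string travels between steps
size = count_bytes(key)
```

With this shape, the value handed between steps (and therefore the value that would be checkpointed) is a short key string instead of the megabyte-sized payload.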
Dylan
04/09/2021, 1:50 PM
Hi @Noah Holm!
When dealing with large volumes of data, I adopt the strategy you’ve outlined here 💯 .
As an FYI, I believe you can remove a result from an individual task in a Flow by specifying result=None in your Task decorator.
Noah Holm
04/09/2021, 3:04 PM
Nice Dylan 🙂
I did try @task(result=None) but it didn’t make any difference. I guess it shouldn’t, since that’s the default value: when the flow is evaluated, how would it know that I explicitly set it to None rather than leaving the default?
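The reasoning above, that an explicit None cannot be told apart from a None default, can be illustrated with a small generic sketch. This is not Prefect's actual implementation; `resolve_with_none_default`, `resolve_with_sentinel`, and `_UNSET` are hypothetical names, and the string "S3Result" merely stands in for the flow-level result.

```python
# Hypothetical sentinel meaning "the caller passed nothing at all".
_UNSET = object()


def resolve_with_none_default(task_result=None, flow_result="S3Result"):
    # None doubles as "not set", so an explicit None also falls back to the flow's result.
    return flow_result if task_result is None else task_result


def resolve_with_sentinel(task_result=_UNSET, flow_result="S3Result"):
    # Only the sentinel triggers the fallback, so an explicit None is preserved.
    return flow_result if task_result is _UNSET else task_result


inherited = resolve_with_none_default(None)  # explicit None is lost
kept_none = resolve_with_sentinel(None)      # explicit None survives
```

With a None default, both "omitted" and "explicitly None" resolve to the flow-level result, which matches the behaviour observed with @task(result=None).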