Nikhil Joseph

08/25/2022, 4:04 PM
Hey, so I have a weird problem. My setup: 1. flows run with flow storage from GitLab, 2. dependencies come from an ECR image (run_config), 3. ECS agent (run_config). Now I have another requirement: storing the results in S3 😅 I register the flow from a different file (a CLI file) and add the flow storage and run config there, separate from the actual flow file. Everything works great up to here: ECS pulls the right image, grabs the flow from GitLab, and runs fine. But I can't get results working this way. 1. S3 storage works when I add the storage to the flow in the flow file. 2. S3 storage does not work when I add it through the CLI file before registering. Any idea what I'm doing wrong? It feels like the storage I specify gets overwritten when the flow gets pulled in from GitLab (speculating, though). Edit: I want results so that I can retry flows when they fail (if anyone was wondering). I'm also planning to write a scheduled flow that clears the results of completed flows so that I don't have to worry about using too much storage (just 3 million records or so :3)

Mason Menges

08/25/2022, 4:48 PM
Hey @Nikhil Joseph, results are not part of the serialization schema, so when you set the configuration in the CLI file it doesn't get included once the flow is serialized and registered. It has to be defined within the flow file so that it can be evaluated at runtime. Caching might fit this use case better.
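A minimal sketch of what "defined within the flow file" could look like, assuming Prefect 1.x (the version that has `run_config` and flow registration). The flow name, task, and bucket name are placeholders, not taken from the thread:

```python
# flow file (the one pulled from GitLab at runtime), assuming Prefect 1.x
from prefect import Flow, task
from prefect.engine.results import S3Result


@task
def extract():
    # Placeholder task; its return value is what gets written to S3.
    return [1, 2, 3]


# The result is attached here, in the flow file, so it is present when the
# flow is loaded from storage at runtime. "my-results-bucket" is hypothetical.
with Flow("my-flow", result=S3Result(bucket="my-results-bucket")) as flow:
    extract()
```

Storage and run config can presumably stay in the separate registration file, since that part was already working; only the result needs to live next to the flow definition.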
It's also worth noting that failed tasks typically don't store anything as a result, since the task failed and nothing is actually returned in that context. If you want to persist the failure exceptions as a result, you would need to set up the tasks to return the exception, or set up a state handler similar to this. That one is for sending Slack notifications, but you could adjust the logic to persist the results/exceptions in an S3 bucket instead.
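The "return the exception" option can be sketched in plain Python. This is only an illustration: `return_exception` and `parse_record` are hypothetical names, and in a real flow the wrapped function would additionally be registered as a task.

```python
import functools


def return_exception(fn):
    """Wrap fn so that a raised exception is returned as its value."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            # Hand the exception back as a normal return value so it can be
            # persisted like any other result.
            return exc
    return wrapper


@return_exception
def parse_record(raw):
    # Hypothetical task body: fails on non-numeric input.
    return int(raw)


ok = parse_record("42")
failed = parse_record("not a number")
```

One trade-off to keep in mind: a task that returns its exception finishes normally as far as the orchestrator is concerned, so downstream tasks would need to check for an exception value themselves.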

Nikhil Joseph

08/26/2022, 8:07 AM
Thanks for getting back. What I meant by retrying: when I retry a flow and the failed tasks run successfully, the other tasks that had already succeeded won't run again, and since their results aren't saved anywhere, those tasks return None and my flow fails (upstream_dependencies).
Guess the best option I have is to move to S3 storage (flow and results).