Are there any limits on what should be stored in the Prefect context? I.e., is it a bad idea to store a DataFrame in prefect.context so that I can pull it from a state_handler? My alternative (I think) would be to write the DataFrame to disk, set its location in the context, and then pull that location from the context in the state handler.
k
In general, using the Prefect context to store things is not advised, but I understand that it's very hard to get data from a task into a state handler. That probably has more to do with a poor design of the state handler. I don't think there would be anything wrong with it, but note that the value won't be in the context for downstream tasks, because the context is not really mutable. Could you tell me more about the use case?
k
Sure - basically I have a task that submits a bunch of jobs to AWS. The DataFrame holds the S3 paths that these jobs will write to. When the task is done (succeeded or failed), I want the state handler to check whether each of the files exists and record that information in the DataFrame.
Maybe this isn't the best or intended use of a state handler, but the state handler is nice because I can guarantee it runs every time this task runs, I don't have to remember to tack another task onto the end of my flow, and I can change the state of the task if needed based on whether all the S3 paths exist.
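A minimal sketch of the check described above, to make the data shape concrete. It assumes the table is a list of dicts with an s3_path key (a dependency-free stand-in for the DataFrame), and path_exists is a hypothetical callable supplied by the caller; in practice it could wrap whatever S3 existence check you already use.

```python
from typing import Callable, Dict, List


def record_outputs(rows: List[Dict], path_exists: Callable[[str], bool]) -> List[Dict]:
    """Return a copy of rows with an 'exists' flag added for each S3 path."""
    return [{**row, "exists": path_exists(row["s3_path"])} for row in rows]


# Hypothetical data: pretend only the first job wrote its output.
written = {"s3://bucket/job-1/out.parquet"}
rows = [
    {"job": "job-1", "s3_path": "s3://bucket/job-1/out.parquet"},
    {"job": "job-2", "s3_path": "s3://bucket/job-2/out.parquet"},
]
checked = record_outputs(rows, lambda path: path in written)
```

Keeping the existence check as an injected callable means the same function can be called from a state handler or from a separate task.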
k
Yeah, I think that would work, but a new task seems more appropriate, and you can raise FAIL or raise SUCCESS based on the content of the DataFrame. You can also use trigger=always_run. You could also put the logic in the same task.
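A rough sketch of the follow-up-task idea, under some assumptions: the rows are plain dicts with s3_path and exists keys (standing in for the DataFrame), and decide_state and verify_outputs are hypothetical names. The decision logic is plain Python so it runs without Prefect installed; the Prefect 1.x wiring is shown only in comments.

```python
from typing import Dict, List, Tuple


def decide_state(rows: List[Dict]) -> Tuple[bool, str]:
    """Return (ok, message) depending on whether every expected output exists."""
    missing = [row["s3_path"] for row in rows if not row["exists"]]
    if missing:
        return False, f"{len(missing)} output(s) missing: {', '.join(missing)}"
    return True, "all outputs present"


# Wiring inside a Prefect 1.x flow might look like this (not executed here,
# and it assumes rows is available to the task):
#
#   from prefect import task
#   from prefect.engine.signals import FAIL, SUCCESS
#   from prefect.triggers import always_run
#
#   @task(trigger=always_run)  # run regardless of the upstream task's state
#   def verify_outputs(rows):
#       ok, message = decide_state(rows)
#       if ok:
#           raise SUCCESS(message)
#       raise FAIL(message)
```

Raising the signal from a dedicated task keeps the pass/fail decision visible in the flow instead of hiding it in a handler.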