Andrew Schechtman-Rook

12/23/2019, 8:13 PM
Hi prefectionists! I'm playing around with prefect core, and I'd love some advice/guidance on something I'm totally stuck on. I have a fair number of tasks which end up pulling data from somewhere, or spit out the result of a complicated operation I'd rather not repeat all the time. I'd like to be able to cache the results of these operations, but I want to cache to/from disk rather than having to pass around large datasets in `prefect.context.caches`. I've tried a bunch of things to try to do this, including:
• Turning on caching, and specifying `LocalResultHandler` as the task/flow's `result_handler`. In this case caching works, but it doesn't use the `result_handler` to store/retrieve the task results.
• Turning on `checkpointing` and specifying `LocalResultHandler` as `result_handler`. This works to save the result to disk, but that appears to be a one way operation - AFAICT there aren't any hooks in prefect to pull the checkpointed data back in when restoring from cache.
• Using custom `state_handler`s to essentially intercept the results and do the file I/O. I'll admit I've spent the least time on this approach, it seems like it might work but I don't have a good enough grasp on prefect to know either how to implement it or even if it's a good idea.
Anyone had to implement something like this, or have any additional ideas?
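For reference, the disk-backed caching described above can be sketched in plain Python, independent of prefect. This is only a rough illustration and not prefect's API: `disk_cache`, `max_age_seconds`, and `expensive_pull` are hypothetical names made up for the example, and it assumes the function's arguments and return value can be pickled.

```python
import functools
import hashlib
import pickle
import tempfile
import time
from pathlib import Path


def disk_cache(cache_dir, max_age_seconds=None):
    """Hypothetical helper: store a function's return value as a pickle on disk."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Key the cache file on the function name and its pickled arguments.
            key_bytes = pickle.dumps((fn.__name__, args, sorted(kwargs.items())))
            path = cache_dir / (hashlib.sha256(key_bytes).hexdigest() + ".pkl")

            # Reuse the stored result if it exists and is still fresh enough.
            if path.exists():
                age = time.time() - path.stat().st_mtime
                if max_age_seconds is None or age <= max_age_seconds:
                    with path.open("rb") as f:
                        return pickle.load(f)

            result = fn(*args, **kwargs)

            # Write to a temporary file and rename it, so a reader never
            # sees a partially written cache entry.
            tmp_path = path.with_suffix(".tmp")
            with tmp_path.open("wb") as f:
                pickle.dump(result, f)
            tmp_path.replace(path)
            return result

        return wrapper

    return decorator


calls = []  # records real executions, to show the second call is served from disk


@disk_cache(tempfile.mkdtemp())
def expensive_pull(n):
    calls.append(n)
    return [i * i for i in range(n)]


first = expensive_pull(4)
second = expensive_pull(4)  # loaded from the pickle file, function body not re-run
```

A wrapper like this could presumably be applied to the plain function before it is turned into a task, which keeps the large dataset out of `prefect.context.caches`, but that combination is untested here and bypasses prefect's own cache validation entirely.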