# prefect-community
Thanks @Chris White for stopping by the Pangeo ML Working Group meeting today. I’ve got a couple of follow-up questions. Let me know if any of these should be escalated to GitHub issues. The main question is about result handler objects. In Pangeo, our I/O stack is something like Google Cloud Storage <- GCSFS <- Zarr <- Xarray. I would like a Prefect task to write data to GCS. The normal way I would do this (without Prefect) is:
ds = ...  # create xarray Dataset
gcfs_w_token = gcsfs.GCSFileSystem(project='pangeo-181919', token=token)
gcsmap = gcsfs.GCSMap(path, gcs=gcfs_w_token)
Obviously I can do that from within a Prefect task, but it kind of seems like I should be using a result handler. Can you point me to any examples of custom handlers? (Bonus points if they show how to use secure credentials.) Thanks again for an awesome tool.
Hey @Ryan Abernathey! Good question; at the end of the day, a result handler is simply an object with read and write methods that are inverses of each other (and it needs to be cloudpickle-able for running on dask). For example, here is our internal implementation of a GCS result handler: This implementation won’t be nearly as performant as using
, but should convey the idea. This handler also uses “Prefect Secrets” --> when running locally, secrets are pulled from the Prefect context, and can be set via environment variable (e.g., export PREFECT__CONTEXT__SECRETS="my-secret"). If you need added security, you could use an encryption package for parsing the secret.
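The "inverse methods" idea can be sketched without any Prefect or GCS dependency. The class below is a hypothetical stand-in (the name LocalDirResultHandler and its one-file-per-result layout are assumptions for illustration, not Prefect's GCS handler): write serializes a result and returns where it put it, and read takes that location and gives the result back. It uses the standard-library pickle rather than cloudpickle to stay self-contained.

```python
import os
import pickle
import tempfile
import uuid


class LocalDirResultHandler:
    """Hypothetical handler: stores each result as a pickle file in a directory."""

    def __init__(self, directory):
        # Keep only plain attributes (here, a string path) on the instance,
        # so the handler object itself stays easy to serialize.
        self.directory = directory

    def write(self, result):
        """Serialize `result` and return the location it was written to."""
        os.makedirs(self.directory, exist_ok=True)
        location = os.path.join(self.directory, uuid.uuid4().hex + ".pkl")
        with open(location, "wb") as f:
            pickle.dump(result, f)
        return location

    def read(self, location):
        """Inverse of `write`: load the result stored at `location`."""
        with open(location, "rb") as f:
            return pickle.load(f)


# Round trip: read(write(x)) should give back x.
handler = LocalDirResultHandler(tempfile.mkdtemp())
location = handler.write({"answer": 42})
restored = handler.read(location)
```

A cloud-backed version would presumably keep the same two-method shape and swap the file operations for an upload in write and a download in read.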
This seems very useful. Thanks! How do I associate a result with a specific handler?
To actually trigger this result handler call, you need to “checkpoint” your Task (Prefect has a bias against storing data unnecessarily, unless users opt in). Two things are necessary to make checkpointing work:
- tasks need to request checkpointing and set their result handler: @task(checkpoint=True, result_handler=my_handler())
- the appropriate checkpointing setting needs to be turned on via env var / config during execution
Task result handlers can be specified via the result_handler keyword as above.
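The two opt-in conditions can be pictured with a toy model. This is not Prefect's engine: run_task, RecordingHandler, and the DEMO_CHECKPOINTING variable are hypothetical names, used only to show that the handler's write fires when both the task and the global setting ask for it.

```python
import os


class RecordingHandler:
    """Hypothetical in-memory handler that records what was written."""

    def __init__(self):
        self.store = {}

    def write(self, result):
        location = f"mem://{len(self.store)}"
        self.store[location] = result
        return location

    def read(self, location):
        return self.store[location]


def run_task(fn, checkpoint=False, result_handler=None, env=None):
    """Run `fn`; persist its result only if both opt-ins are present."""
    env = os.environ if env is None else env
    # DEMO_CHECKPOINTING is a made-up variable name standing in for the
    # real global setting.
    globally_enabled = env.get("DEMO_CHECKPOINTING", "false").lower() == "true"
    result = fn()
    location = None
    if checkpoint and globally_enabled and result_handler is not None:
        location = result_handler.write(result)
    return result, location


handler = RecordingHandler()

# Task opts in, but the global setting is off: nothing is written.
_, loc_off = run_task(lambda: 1 + 1, checkpoint=True, result_handler=handler,
                      env={"DEMO_CHECKPOINTING": "false"})

# Both opt-ins present: the handler's write() is called.
value, loc_on = run_task(lambda: 1 + 1, checkpoint=True, result_handler=handler,
                         env={"DEMO_CHECKPOINTING": "true"})
```

The point of the model is that forgetting either switch fails silently: the task still runs and returns its value, but nothing is stored.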
I’ll give this a try and report back. Thanks
Yeah, anytime! I’m super excited to hear the Pangeo group’s feedback and possibly work with you all to improve Prefect!
@Marvin archive “How can I create and set a custom result handler?”