Ryan Abernathey
09/09/2019, 6:26 PM
I have a question about ResultHandler objects. In Pangeo, our I/O stack is something like Google Cloud Storage <- GCSFS <- Zarr <- Xarray.
I would like a Prefect task to write data to GCS. The normal way I would do this (without Prefect) is:
```python
import gcsfs

ds = ...  # create xarray Dataset
# token and path are defined elsewhere
gcs_w_token = gcsfs.GCSFileSystem(project='pangeo-181919', token=token)
gcsmap = gcsfs.GCSMap(path, gcs=gcs_w_token)
ds.to_zarr(gcsmap)
```
Obviously I can do that from within a Prefect task, but it kind of seems like I should be using a ResultHandler. Can you point me to any examples of custom handlers? (Bonus points if they show how to use secure credentials.)
Thanks again for an awesome tool.

Chris White
09/09/2019, 6:41 PM
A custom result handler just needs read / write methods that are inverses of each other (and it needs to be cloudpickle-able for running on Dask). For example, here is our internal implementation of a GCS result handler: https://github.com/PrefectHQ/prefect/blob/master/src/prefect/engine/result_handlers/gcs_result_handler.py
This implementation won't be nearly as performant as using gcsfs, but should convey the idea. This handler also uses "Prefect Secrets": when running locally, secrets are pulled from prefect.context, and can be set via environment variable (e.g., export PREFECT__CONTEXT__SECRETS="my-secret"). If you need added security, you could use an encryption package for parsing the secret.
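[Editor's note] The read/write contract described above can be sketched without any cloud dependencies. The following is an illustrative stand-in, not the linked implementation: a real handler would serialize with cloudpickle and write to GCS via gcsfs, and the class name LocalResultHandler is hypothetical. The key property it demonstrates is that read is the inverse of write.

```python
import os
import pickle
import tempfile
import uuid

class LocalResultHandler:
    """Sketch of the result-handler contract: write() stores a result and
    returns a URI; read() takes that URI and returns the original result.
    A real GCS handler would swap pickle + local disk for cloudpickle +
    gcsfs, but the inverse relationship read(write(x)) == x is the
    essential requirement."""

    def __init__(self, directory=None):
        # store results in a scratch directory (stand-in for a GCS bucket)
        self.directory = directory or tempfile.mkdtemp()

    def write(self, result):
        # serialize the result and return a URI that locates it
        uri = os.path.join(self.directory, uuid.uuid4().hex)
        with open(uri, "wb") as f:
            pickle.dump(result, f)
        return uri

    def read(self, uri):
        # inverse of write: recover the original result from its URI
        with open(uri, "rb") as f:
            return pickle.load(f)

handler = LocalResultHandler()
uri = handler.write({"mean": 3.14})
assert handler.read(uri) == {"mean": 3.14}
```

For the zarr use case in the original question, write() would call ds.to_zarr(gcsmap) and return the GCS path, while read() would open the dataset back from that path.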
Ryan Abernathey
09/09/2019, 6:43 PM

Chris White
09/09/2019, 6:43 PM
To checkpoint a task's output through a handler:
- decorate the task with @task(checkpoint=True, result_handler=my_handler())
- the appropriate setting needs to be turned on via env var / config during execution: export PREFECT__FLOWS__CHECKPOINTING=true
- alternatively, pass the handler via the result_handler keyword as above
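[Editor's note] Prefect reads configuration like PREFECT__FLOWS__CHECKPOINTING from the environment, with double underscores marking nested config keys. A rough, hypothetical sketch of that mapping (this is not Prefect's actual config loader, just an illustration of the convention):

```python
def parse_prefect_env(environ):
    """Map Prefect-style env vars to a nested config dict, e.g.
    PREFECT__FLOWS__CHECKPOINTING=true -> config['flows']['checkpointing'] = True.
    Simplified sketch for illustration only."""
    config = {}
    for key, value in environ.items():
        if not key.startswith("PREFECT__"):
            continue  # ignore unrelated environment variables
        path = key[len("PREFECT__"):].lower().split("__")
        node = config
        for part in path[:-1]:
            node = node.setdefault(part, {})
        # coerce booleans; leave everything else as a string
        if value.lower() in ("true", "false"):
            node[path[-1]] = value.lower() == "true"
        else:
            node[path[-1]] = value
    return config

cfg = parse_prefect_env({"PREFECT__FLOWS__CHECKPOINTING": "true"})
assert cfg == {"flows": {"checkpointing": True}}
```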
Ryan Abernathey
09/09/2019, 6:47 PM

Chris White
09/09/2019, 6:49 PM

Marvin
09/09/2019, 6:49 PM