What is the best practice for storing metadata abo...
# best-practices
m
What is the best practice for storing metadata about object locations (e.g. in various data lake zones) when using Prefect to move this data from zone to zone? Unless we store metadata somewhere, we lose track of where data lives and which data comes from which source once a Prefect Task has done the initial processing. However, we are unsure where to store this metadata and whether there are best practices for it.
a
With Prefect 1.0, the answer is the KV Store if you are on Prefect Cloud. It is the easiest way to store arbitrary metadata as key-value pairs. That said, I understand that metadata management is a hard problem. Could you attach this metadata directly through your data lake's catalog? For example, an S3 data lake lets you attach metadata using the Glue catalog, and most data lakes have a similar integrated solution.
In Prefect 2.0, we are working on features to address this more holistically.
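To make the key-value idea concrete, here is a minimal sketch of recording where a dataset currently lives after each zone-to-zone move. The function names, the key format, the record fields, and the example bucket paths are all hypothetical. An in-memory dict stands in for the store so the snippet runs anywhere; on Prefect 1.0 with Prefect Cloud, `prefect.backend.set_key_value` and `prefect.backend.get_key_value` could probably be passed in instead.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Dict

# In-memory stand-in for a key-value backend (assumption: replace with
# prefect.backend.set_key_value / get_key_value on Prefect Cloud).
_local_store: Dict[str, str] = {}


def _set_local(key: str, value: str) -> None:
    _local_store[key] = value


def _get_local(key: str) -> str:
    return _local_store[key]


def record_location(
    dataset: str,
    zone: str,
    uri: str,
    source: str,
    set_kv: Callable[[str, str], object] = _set_local,
) -> str:
    """Store the latest location of a dataset under a hypothetical key scheme."""
    key = f"location_{dataset}"
    record = {
        "zone": zone,      # e.g. "raw", "curated"
        "uri": uri,        # where the object lives now
        "source": source,  # upstream system the data came from
        "updated_at": datetime.now(timezone.utc).isoformat(),
    }
    # Serialize to a JSON string so any string-valued store can hold it.
    set_kv(key, json.dumps(record))
    return key


def lookup_location(
    dataset: str,
    get_kv: Callable[[str], str] = _get_local,
) -> dict:
    """Read back the most recently recorded location of a dataset."""
    return json.loads(get_kv(f"location_{dataset}"))


# Example: the same dataset moves from the raw zone to the curated zone.
record_location("orders", "raw", "s3://example-lake/raw/orders/", "erp_export")
record_location("orders", "curated", "s3://example-lake/curated/orders/", "erp_export")
latest = lookup_location("orders")
```

Each call overwrites the previous record, so this only tracks the current location and its source. Keeping a full history of moves would need a different key scheme or a catalog-based approach.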
m
Are these features trackable anywhere so I can follow or subscribe to the progress?
a
Great question! The easiest way to stay up to date is to follow this Discourse topic: hit the 🔔 icon on Discourse to get email notifications. We also post about it in #announcements.