    Michiel Verburg

    4 months ago
    What is the best practice for storing metadata about object locations (e.g. in various data lake zones) when using Prefect to move this data from zone to zone? Unless we store metadata somewhere, we lose track of where data lives and which source it came from once a Prefect Task has initially processed it. However, we are unsure where to store this metadata and whether there are best practices around it.
    Anna Geller

    4 months ago
    With Prefect 1.0, the answer is the KV Store if you are on Prefect Cloud - this is the easiest way of storing arbitrary metadata as key-value pairs. That said, I understand that metadata management is a hard problem. I wonder whether you could attach this metadata directly through your data lake's catalog? E.g. an S3 data lake allows attaching metadata via the Glue catalog - most data lakes have a similar integrated solution.
    In Prefect 2.0, we are working on features to address this more holistically.
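To make the KV Store suggestion concrete, here is a minimal sketch of recording where a dataset lives after a task moves it. The key scheme (`location_<dataset>`), the record fields, and the helper names are assumptions for illustration, not a Prefect convention. The writer is passed in as a plain callable so the sketch runs without a Cloud connection; on Prefect 1.0 with Prefect Cloud you would likely pass `set_key_value` from `prefect.backend`.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Optional


def build_location_record(
    dataset: str, zone: str, uri: str, source: str, moved_at: Optional[str] = None
) -> Dict[str, str]:
    """Build a small JSON-compatible record describing where a dataset lives."""
    return {
        "dataset": dataset,
        "zone": zone,      # hypothetical zone name, e.g. "raw" or "curated"
        "uri": uri,        # current object location
        "source": source,  # upstream system the data came from
        "moved_at": moved_at or datetime.now(timezone.utc).isoformat(),
    }


def record_location(
    set_kv: Callable[[str, Any], Any],
    dataset: str,
    zone: str,
    uri: str,
    source: str,
    moved_at: Optional[str] = None,
) -> str:
    """Write the record through any key-value writer and return the key used."""
    key = f"location_{dataset}"  # assumed naming scheme, one key per dataset
    set_kv(key, build_location_record(dataset, zone, uri, source, moved_at))
    return key


# Usage with an in-memory dict standing in for the KV Store.
# With Prefect 1.0 on Prefect Cloud, `set_kv` would presumably be
# `prefect.backend.set_key_value`, called from inside the task that moves the data.
store: Dict[str, Any] = {}
record_location(
    store.__setitem__,
    dataset="orders",
    zone="curated",
    uri="s3://example-bucket/curated/orders/",
    source="crm_export",
)
```

Keeping one key per dataset means each move overwrites the previous location, so the store answers "where does this live now?" rather than keeping a full history. If history matters, a catalog or a dedicated table is probably a better fit than key-value pairs.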
    Michiel Verburg

    4 months ago
    Are these features trackable anywhere so I can follow or subscribe to the progress?
    Anna Geller

    4 months ago
    Great question! The easiest way to stay up to date on that is to follow this Discourse topic - hit the 🔔 icon on Discourse to get email notifications. We also post about it in #announcements.