
Ricardo Gaspar

11/08/2022, 4:35 PM
Hey there. Is there any plan for an OpenLineage integration? For ref:
• https://openlineage.io/
• https://github.com/OpenLineage/OpenLineage/issues/81
• there was one for Prefect v1: https://github.com/OpenLineage/OpenLineage/pull/293
CC @limx0 @Anna Geller

Anna Geller

11/08/2022, 11:33 PM
TTBOMK, there is not at the moment. What problem are you trying to solve with OpenLineage? For transparency, I'm not a big fan of this project, but I'm very interested in the underlying problems you would like to address.

Ricardo Gaspar

11/09/2022, 10:57 AM
I’m interested in using Marquez to get dataset lineage and metadata management/discovery. Would you suggest OpenMetadata instead? I don’t know if it integrates with Spark (Scala).

Anna Geller

11/09/2022, 12:42 PM
Marquez doesn't add anything over the workflow metadata that Prefect already provides in the UI. But if you are interested in metadata about your data rather than your workflow, then there are many great tools you could explore, including OpenMetadata, DataHub, Atlan, Stemma, and tens if not hundreds of other metadata tools.
🙏 1
😉 1

Ricardo Gaspar

02/10/2023, 5:57 PM
Just revisiting this. I like OpenMetadata, it seems very interesting; I haven’t played with it yet, but it looks like it will be in a better state once it reaches a major release. Answering your question, what I’d like is a single tool/framework that can get data lineage from Spark (ideally column-level) as well as from the orchestration tool (Prefect in this case; Airflow has some integration with OpenLineage and Marquez).

Anna Geller

02/10/2023, 5:59 PM
There are like millions of lineage tools on the market, but the problem is more that everyone has a different understanding of what lineage really is.
If you are on Prefect Cloud, you'll be able to do a lot with Automations, in a more actionable way than most data catalogs offer. But if you need a data catalog, you'd need to do a PoC for that specific tool. AFAIK no lineage tool integrates Prefect workflow metadata yet, but you don't necessarily need that; it all depends on what you're trying to accomplish.
😉 1
🙏 1
:thank-you: 1

Brad

02/15/2023, 10:59 PM
Hey @Ricardo Gaspar - I'm still interested in this space, and OpenMetadata looks pretty interesting. I might try and have a play around with a Prefect integration - would you be keen to contribute?