Henning Holgersen
04/10/2023, 2:48 PM
Will Raphaelson
04/10/2023, 2:57 PM
Brad
04/11/2023, 3:07 AM
Henning Holgersen
04/11/2023, 4:37 AM
Brad
04/11/2023, 4:40 AM
Will Raphaelson
04/11/2023, 5:48 PM
Henning Holgersen
04/11/2023, 6:28 PM
complete_run method I call at the end to send a message that the flow is over. I mentioned something about the events API as a possible replacement, but I don’t think that would be a good solution. Ideally, maybe some kind of post-hook system? I have no idea. Perhaps making developers call that method isn’t too bad, and we can live with it.
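One way to approximate the post-hook idea without relying on developers remembering the final call is a context manager. The following is a minimal sketch, not a Prefect feature: LineageClient, start_run, and lineage_run are hypothetical names made up for illustration, and only the complete_run name comes from the message above.

```python
from contextlib import contextmanager
from typing import Iterator, List


class LineageClient:
    """Hypothetical stand-in for a lineage client; it only records messages in memory."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def start_run(self, flow_name: str) -> None:
        self.messages.append(f"START {flow_name}")

    def complete_run(self, flow_name: str, ok: bool) -> None:
        # In a real integration this would notify the lineage backend that the flow is over.
        self.messages.append(f"{'COMPLETE' if ok else 'FAIL'} {flow_name}")


@contextmanager
def lineage_run(client: LineageClient, flow_name: str) -> Iterator[LineageClient]:
    """Call complete_run on exit, whether the body succeeded or raised."""
    client.start_run(flow_name)
    ok = False
    try:
        yield client
        ok = True
    finally:
        client.complete_run(flow_name, ok)


client = LineageClient()
with lineage_run(client, "my-flow"):
    pass  # flow logic would go here
```

The finally clause is what stands in for the post-hook: the completion message is sent even if the body raises, so the developer never calls complete_run by hand.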
From my side, I really like the idea of a kind of lineage-block SDK, providing a pattern and common functions that can be adapted for specific 3rd-party services. That is where my mind will be going forward.
Lastly, I am also pondering data lineage vs just surfacing dependencies. This is a very vague thought, but came from my realization that a flow can reach out to databases/servers/APIs and have a real dependency without any real data being involved. Not in the “data lineage” sense anyways. We might want to flesh out such relationships as well, but I don’t think there is anything at all in that space. Marquez and friends are focused on datasets, with schemas and such; trying to ram non-dataset stuff into that probably won’t work well.
Will Raphaelson
04/12/2023, 3:55 PM
> I knew extending observability was on your roadmap, and it is of course difficult to guess how this is going to fit with what everybody else is doing. And I see the block events pop up nicely in the events feed. Is the next step to connect it to flow runs?
Yeah, you should see flows/tasks as related resources on block events at this point.
> In Prefect, it would be great to have a more consistent context object, so that I could always find the flow name and flow id in the same place no matter if it is being called from a task or from a flow. This would be a great way to tie everything together. For all I know this might be an incredibly difficult thing, or it might be near trivial. I have no idea, so I’m just mentioning it.
We intend to expand this context object pretty readily. Any use cases or objects in particular you’d like added?
> From my side, I really like the idea of a kind of lineage-block SDK, providing a pattern and common functions that can be adapted for specific 3rd-party services. That is where my mind will be going forward.
Yeah, we’re focused on a more or less auto-instrumented approach in the near future, both of blocks and potentially all executed code. @Chris White and I have noodled on an @autoinstrumented decorator you could put on a class or function that would ping out to Prefect, but eventually other specific providers isn’t an unreasonable thought.
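To make the decorator idea concrete, here is a rough sketch of what an @autoinstrumented decorator could look like for plain functions. This is an assumption about the shape of the idea, not Prefect's implementation: no such decorator is confirmed in this thread as shipped, the EVENTS list is a hypothetical in-memory sink standing in for "pinging out" to Prefect or another provider, and the event names are made up.

```python
import functools
from typing import Any, Callable, Dict, List

# Hypothetical sink: a real version would send these to an events API instead.
EVENTS: List[Dict[str, Any]] = []


def autoinstrumented(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record a 'called' event and then a 'succeeded' or 'failed' event per call."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        resource = f"{func.__module__}.{func.__qualname__}"
        EVENTS.append({"event": "called", "resource": resource})
        try:
            result = func(*args, **kwargs)
        except Exception:
            EVENTS.append({"event": "failed", "resource": resource})
            raise
        EVENTS.append({"event": "succeeded", "resource": resource})
        return result

    return wrapper


@autoinstrumented
def load_table(table: str) -> str:
    # Placeholder for code that touches an external system.
    return f"loaded {table}"
```

Supporting classes as well as functions, as mentioned above, would mean wrapping each public method in the same way; that part is left out of the sketch.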
> Lastly, I am also pondering data lineage vs just surfacing dependencies…
This is a super important thought that I/we share. A core hypothesis of the observability roadmap is that existing lineage solutions are necessary but not sufficient to really understand a data-intensive stack. The vision for events, with their primary and related resources, is that they can be the place where we map out this looser dependency tree that’s not quite data, not quite process, it’s something squishier, and we think instrumenting blocks is a good first step to see what this graph might look like.
Chris White
04/12/2023, 3:58 PM
prefect.runtime is a candidate for the universal context interface you're looking for @Henning Holgersen - I don't think we currently have the two fields you explicitly mentioned (fields are easy to add), but for example prefect.runtime.flow_run.id can be accessed anywhere that there is an overriding known flow run ID (for example, you can reference this ID outside of either a flow or a task function so long as the script is being run via a deployment!)
Henning Holgersen
04/12/2023, 7:39 PM
prefect.runtime object was exactly what I wanted. Thanks!
Will Raphaelson
04/12/2023, 7:40 PM
Henning Holgersen
04/12/2023, 7:41 PM
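As an illustration of the prefect.runtime suggestion from earlier in the thread, the sketch below reads prefect.runtime.flow_run.id through one helper that can be called from a flow, a task, or module-level code. The helper names current_flow_run_id and lineage_tags are hypothetical, and the guarded import is only there so the snippet also runs where Prefect is not installed. It assumes the attribute is None when no flow run is known.

```python
from typing import Dict, Optional


def current_flow_run_id() -> Optional[str]:
    """Return the current flow run ID, or None if it is not known."""
    try:
        # Imported lazily so the sketch still works without Prefect installed.
        from prefect.runtime import flow_run
    except ImportError:
        return None
    return flow_run.id


def lineage_tags(dataset: str) -> Dict[str, Optional[str]]:
    """Hypothetical helper: attach the flow run ID to a lineage record."""
    return {"dataset": dataset, "flow_run_id": current_flow_run_id()}


tags = lineage_tags("orders")
```

Because the lookup lives in one place, lineage code does not need to know whether it was called from a flow or from a task, which is the consistency asked for above.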