
An Hoang

04/13/2020, 4:15 PM
Hi everyone, does anyone know of a way to visualize the relationships between pandas DataFrames as a directed acyclic graph? For example, df_B is obtained from df_A by passing it through some function A_to_B_func(), df_D is made by merging df_B with df_C, etc. If it’s a complex function, yes, we can write Prefect tasks to visualize it, but usually it’s an aggregate of small pandas functions and merges, so it would be a pain to wrap each of those in its own Prefect task. Maybe I would need to write a complex context manager/decorator or a metaclass wrapper around DataFrame to record these operations automatically into a Prefect task graph? Sorry, the idea is not so clear in my head right now…

Alex Cano

04/13/2020, 4:22 PM
I’m not sure if there’s something built specifically for pandas, but I’m sure you could get some of the functionality out of a generic call-graph-style chart. It’ll probably generate a graph of every part of your code (not just the pandas portions). Is that the kind of thing you were looking for?

An Hoang

04/13/2020, 5:13 PM
Hi @Alex Cano, something along those lines, but not quite. I want the nodes to be the pandas objects and the edges to be the functions.
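A rough sketch of what that could look like, in case it helps make the idea concrete. This is not a Prefect or pandas feature: the `Lineage` class, its `name`, `track`, and `to_dot` methods, and the `merge_B_C` helper are all hypothetical names made up for illustration. It only tracks object identity, so plain lists stand in for DataFrames here; the same decorator should work on functions that take and return DataFrames, as long as they return a new object rather than mutating the input in place.

```python
from functools import wraps


class Lineage:
    """Hypothetical recorder: nodes are named objects, edges are functions."""

    def __init__(self):
        self.edges = []    # (source_name, function_name, target_name)
        self._names = {}   # id(obj) -> label
        self._keep = []    # hold references so ids are not reused

    def name(self, obj, label):
        """Register an object (e.g. a DataFrame) under a node label."""
        self._names[id(obj)] = label
        self._keep.append(obj)
        return obj

    def track(self, output_name):
        """Decorator: record an edge from every registered input to the output."""
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                result = func(*args, **kwargs)
                for arg in list(args) + list(kwargs.values()):
                    source = self._names.get(id(arg))
                    if source is not None:
                        self.edges.append((source, func.__name__, output_name))
                self.name(result, output_name)
                return result
            return wrapper
        return decorator

    def to_dot(self):
        """Render the recorded edges as Graphviz DOT text."""
        lines = ["digraph lineage {"]
        for src, fn, dst in self.edges:
            lines.append(f'  "{src}" -> "{dst}" [label="{fn}"];')
        lines.append("}")
        return "\n".join(lines)


lineage = Lineage()
df_A = lineage.name([1, 2, 3], "df_A")      # stand-in for a DataFrame
df_C = lineage.name([10, 20, 30], "df_C")   # stand-in for a DataFrame


@lineage.track("df_B")
def A_to_B_func(df):
    return [x * 2 for x in df]


@lineage.track("df_D")
def merge_B_C(left, right):
    return list(zip(left, right))


df_B = A_to_B_func(df_A)
df_D = merge_B_C(df_B, df_C)
print(lineage.to_dot())
```

The DOT text can be pasted into any Graphviz viewer. The trade-off is that each step still has to be a decorated function, so long chains of small inline pandas calls would need to be grouped into named steps first.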

Leo Meyerovich (Graphistry)

04/13/2020, 5:23 PM
https://github.com/graphistry/pygraphistry goes directly from pandas 🙂
(but obv feel free to do something else 🙂 )

An Hoang

04/13/2020, 7:15 PM
Didn’t know that! I was going to respond to you after digging a little deeper into Graphistry; it looks very cool. Did not want to sound uninformed 🙂. Thanks!
💪 1