
    An Hoang

    2 years ago
    Hi everyone, does anyone know of a way to visualize the relationships between pandas DataFrames as a Directed Acyclic Graph? For example, df_B is obtained from df_A by passing it through some function A_to_B_func(), df_D is made by merging df_B with df_C, etc. If it’s a complex function, yes, we can write Prefect Tasks to visualize it, but usually it’s an aggregate of small pandas functions and merges, so it would be a pain to wrap each of those in its own Prefect task. Maybe I would need to write a complex context manager/decorator or a metaclass wrapper around the DataFrame to record these operations automatically into a Prefect task graph? Sorry, the idea is not so clear in my head right now…

    Alex Cano

    2 years ago
    I’m not sure if there’s something specifically built for pandas, but I’m sure you could get some of the functionality out of a generic call-graph-style chart. It’ll probably generate a graph of every part of your code (not just the pandas portions). Is that the kind of thing you were looking for?

    An Hoang

    2 years ago
    Hi @Alex Cano, something along those lines, but not quite. I want the nodes to be the pandas objects and the edges to be the functions.
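One way to picture the "nodes are objects, edges are functions" idea is a small decorator that records lineage as functions run. The sketch below is only an illustration under stated assumptions: `LineageGraph`, `register`, `track`, `to_dot`, and `merge_B_C` are hypothetical names, not part of pandas, Prefect, or Graphistry, and plain lists stand in for DataFrames so the example runs with the standard library alone. It tracks objects by identity, so it only sees whole function calls, not the individual pandas operations inside them.

```python
from functools import wraps


class LineageGraph:
    """Hypothetical helper: records which named objects were derived from which."""

    def __init__(self):
        self._names = {}  # id(obj) -> registered name
        self._keep = []   # hold references so ids are not reused by new objects
        self.edges = []   # (source_name, target_name, function_name)

    def register(self, name, obj):
        """Give an object (e.g. a DataFrame) a node name in the graph."""
        self._names[id(obj)] = name
        self._keep.append(obj)
        return obj

    def track(self, output_name):
        """Decorator: name the function's result and link it to its known inputs."""
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                # Resolve input names before the call, in case the function
                # returns one of its own arguments.
                inputs = list(args) + list(kwargs.values())
                sources = [self._names[id(a)] for a in inputs if id(a) in self._names]
                result = func(*args, **kwargs)
                self.register(output_name, result)
                for source in sources:
                    self.edges.append((source, output_name, func.__name__))
                return result
            return wrapper
        return decorator

    def to_dot(self):
        """Render the recorded edges as Graphviz DOT text."""
        lines = ["digraph lineage {"]
        for src, dst, fn in self.edges:
            lines.append(f'    "{src}" -> "{dst}" [label="{fn}"];')
        lines.append("}")
        return "\n".join(lines)


graph = LineageGraph()

# Plain lists stand in for DataFrames here (assumption for a dependency-free demo).
df_A = graph.register("df_A", [1, 2, 3])
df_C = graph.register("df_C", [10, 20, 30])


@graph.track("df_B")
def A_to_B_func(df):
    return [x * 2 for x in df]


@graph.track("df_D")
def merge_B_C(left, right):
    return list(zip(left, right))


df_B = A_to_B_func(df_A)
df_D = merge_B_C(df_B, df_C)
print(graph.to_dot())
```

The DOT text can be pasted into any Graphviz renderer. Each tracked function has a fixed output name in this sketch, so calling the same function twice would reuse the node name.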

    Leo Meyerovich (Graphistry)

    2 years ago
    https://github.com/graphistry/pygraphistry goes directly from pandas 🙂
    (but obv feel free to do something else 🙂 )

    An Hoang

    2 years ago
    Didn’t know that! I was going to respond to you after digging a little deeper into Graphistry; it looks very cool. Did not want to sound uninformed 🙂. Thanks!