Prefect Community #ask-community

Fina Silva-Santisteban

10/31/2021, 2:37 PM

Hi everyone! Is there a way to get/download the timeline summary of a flow run? E.g. a table and/or bar chart with the completion time for each task? That would help with benchmarking and identifying bottlenecks. If I take one of our examples, you can see that there’s a task in the middle that takes a long time to run, and some that take less. It gives me an idea of potential bottlenecks and how the completion times relate to each other, so that’s good, but I don’t know how long each task actually takes unless I hover over each value. The first impression of the task in the middle being a bottleneck is still valid, but since it only takes 1 minute to run we might be ok with that bottleneck. A summary, at least in table form, would help out a lot!

Anna Geller

10/31/2021, 3:03 PM

@Fina Silva-Santisteban afaik, there is no such option atm. But you can use GraphQL queries to create that. It would involve some work, especially to calculate the durations. Maybe this can help you get started:

• query showing fields that may be helpful to create this dashboard:

query {
  flow_run(
    where: { state: { _eq: "Success" } }
    order_by: { state_timestamp: desc }
    limit: 100
  ) {
    name
    flow_id
    scheduled_start_time
    start_time
    end_time
    task_runs {
      name
      id
      end_time
      start_time
    }
    version
    agent {
      type
    }
  }
}

• using client to execute the query

from prefect import Client

client = Client()
query = """
paste the above
"""
# keep the result so you can work with the returned data
response = client.graphql(query)

Fina Silva-Santisteban

11/01/2021, 12:31 PM

@Anna Geller This looks interesting! But I’m afraid I can’t quite follow: how is sending a flow run request manually using graphql going to provide me with completion time stats? 🤔

Anna Geller

11/01/2021, 12:38 PM

That’s what I meant: you would have to do some work on your end in Python code to calculate the durations. What the GraphQL query gives you is raw data, incl. start and end times of both flow runs and task runs. You could then use this data in Python code to make calculations and visualize it, e.g. with matplotlib. I would be super interested to hear if anyone from the community has perhaps done it in the past and can share their approach.
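A minimal sketch of the duration calculation described above, using only the standard library. It assumes the timestamps come back as ISO 8601 strings and that each flow run is a dict containing a `task_runs` list with `name`, `start_time` and `end_time`, as in the query. The helper names `parse_ts` and `task_durations` and the sample data are made up for illustration.

```python
from datetime import datetime
from typing import Optional


def parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp string; return None if it is missing."""
    if not value:
        return None
    # Assumption: timestamps look like "2021-10-31T14:00:00+00:00".
    # A trailing "Z" is rewritten so older Python versions can parse it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def task_durations(flow_run: dict) -> list:
    """Return (task name, seconds) pairs for one flow run, slowest first."""
    rows = []
    for task_run in flow_run.get("task_runs", []):
        start = parse_ts(task_run.get("start_time"))
        end = parse_ts(task_run.get("end_time"))
        if start is None or end is None:
            continue  # task never started or has no recorded end
        rows.append((task_run["name"], (end - start).total_seconds()))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up sample in the shape of one entry of the flow_run list.
sample_flow_run = {
    "name": "example-run",
    "task_runs": [
        {
            "name": "extract",
            "start_time": "2021-10-31T14:00:00+00:00",
            "end_time": "2021-10-31T14:00:05+00:00",
        },
        {
            "name": "transform",
            "start_time": "2021-10-31T14:00:05+00:00",
            "end_time": "2021-10-31T14:01:05+00:00",
        },
        {"name": "skipped", "start_time": None, "end_time": None},
    ],
}

# Print a small table of task durations.
for name, seconds in task_durations(sample_flow_run):
    print(f"{name:<12} {seconds:8.1f} s")
```

With real data you would call `task_durations` on each flow run in the query result instead of on the sample dict.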

Fina Silva-Santisteban

11/02/2021, 12:40 PM

@Anna Geller I use the Prefect API to trigger flow runs and don’t send GraphQL requests myself, so I don’t know what I would get back if I sent a GraphQL query. Does it make sense now why I don’t understand your suggestions? 🙂 Can you pls post links to docs that show exactly what gets returned from GraphQL, and whether I need to query GraphQL myself to get that info?

Anna Geller

11/02/2021, 1:01 PM

@Fina Silva-Santisteban sure, here is the documentation you may want to check: https://docs.prefect.io/orchestration/concepts/api.html#getting-started As for what you get as a result: it is a dictionary. You get the same response whether you run it from Python or from the API playground in the UI:

from prefect import Client

client = Client()

query = """
query {
  flow_run(
    where: { state: { _eq: "Success" } }
    order_by: { state_timestamp: desc }
    limit: 100
  ) {
    name
    flow_id
    flow {
      name
      project {name}
    }
    scheduled_start_time
    start_time
    end_time
    task_runs {
      name
      id
      end_time
      start_time
    }
    version
    agent {
      type
    }
  }
}
"""
response = client.graphql(query)
print(response)

Output:

{'data': {'flow_run': [a long list of your flow runs here]}}
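To get something downloadable out of that structure, one option is to flatten the nested result into rows and write them as CSV. This is only a sketch: it assumes the response can be indexed like a plain dict in the shape shown above, and the function names `flatten_response` and `rows_to_csv` as well as the sample response are hypothetical.

```python
import csv
import io

FIELDS = ["flow_run", "task_run", "start_time", "end_time"]


def flatten_response(response: dict) -> list:
    """Turn the nested flow_run/task_runs result into one row per task run."""
    rows = []
    for flow_run in response["data"]["flow_run"]:
        for task_run in flow_run.get("task_runs", []):
            rows.append(
                {
                    "flow_run": flow_run["name"],
                    "task_run": task_run["name"],
                    "start_time": task_run["start_time"],
                    "end_time": task_run["end_time"],
                }
            )
    return rows


def rows_to_csv(rows: list) -> str:
    """Serialize the flat rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up response in the shape {'data': {'flow_run': [...]}}.
sample_response = {
    "data": {
        "flow_run": [
            {
                "name": "example-run",
                "task_runs": [
                    {
                        "name": "extract",
                        "start_time": "2021-10-31T14:00:00+00:00",
                        "end_time": "2021-10-31T14:00:05+00:00",
                    },
                    {
                        "name": "load",
                        "start_time": "2021-10-31T14:00:05+00:00",
                        "end_time": "2021-10-31T14:00:09+00:00",
                    },
                ],
            }
        ]
    }
}

csv_text = rows_to_csv(flatten_response(sample_response))
print(csv_text)
```

The CSV text can then be written to a file or opened in a spreadsheet to compare task timings side by side.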

Kevin Kho

11/02/2021, 2:59 PM

Hey @Fina Silva-Santisteban, I went over this and the query Anna gave is not creating a Flow run. It is for getting the flow run statistics after the Flow run already happened. You would need the GraphQL API to get the info and then manipulate it yourself. We don’t calculate duration either (the UI does that on the fly), so you need to manually do that using the start time and end time from the info you receive from the API.

Fina Silva-Santisteban

11/02/2021, 5:51 PM

Thank you both, that’s very helpful! 💪
