Hi, is there a way to safely delete Prefect logs f...
# ask-community
f
Hi, is there a way to safely delete Prefect logs from Postgres without messing up scheduled flows? The tables log, task_run and task_run_state can grow rapidly in size, so I want to delete their rows every X days, but only for flow runs that have already run, not for the scheduled ones.
k
Hi @Fabrice Toussaint, I’ll need to verify with the team if this is possible. I think we can only delete flow_runs from the GraphQL API, and I don’t know if this deletes the logs.
f
Hi @Kevin Kho, thank you for your response 🙂. Let me know what the team says about this. If I delete everything, all the scheduled flows break (obviously) and I need to reschedule them, which is annoying to do every day. 😛
k
You can use the GraphQL API to just delete the successful (or failed) flow runs like this:
query {
  flow_run(where: {state: {_eq: "Success"}}) {
    id
    name
    state
  }
}

mutation {
  delete_flow_run(input: {
    flow_run_id: "02a4cd19-50cd-49b2-bf1b-ff632f0ded3a"
  }) {
    success
  }
}
I just don’t know if it deletes entries in the task table too
f
Thanks again @Kevin Kho, I have two other questions:
• I am new to GraphQL, so does the above delete all flows that ran successfully, or do I need to implement more? (It seems to use a specific flow_run_id in this case.)
• I also have "successful" flows that do not have a successful state. What is the state of a flow that is scheduled (is it "scheduled")?
And I really need to know if it deletes the data from the tables I mentioned, because these account for ~95% of the size of the data.
k
The first one gets all flow runs that ran successfully; you’d need to programmatically get the ids and pass them to the second query. The state of a scheduled flow is Scheduled. I don’t have confirmation yet, but will reach out when I do.
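To make the "get the ids, then pass them to the mutation" step concrete, here is a minimal sketch. It reuses the query and mutation shapes from the messages above. The helper names and the `run_graphql` callable are hypothetical (not part of Prefect); `run_graphql` is assumed to take a GraphQL string and return the parsed response as a dict. Only Success and Failed runs are selected, so Scheduled runs are left alone.

```python
from typing import Callable, Dict, List

# States considered finished; "Scheduled" is deliberately not listed.
FINISHED_STATES = ["Success", "Failed"]


def build_runs_query(state: str) -> str:
    """Query for the flow runs in one state (same shape as the query above)."""
    return 'query { flow_run(where: {state: {_eq: "%s"}}) { id name state } }' % state


def build_delete_mutation(flow_run_id: str) -> str:
    """Mutation deleting a single flow run by id (same shape as the mutation above)."""
    return 'mutation { delete_flow_run(input: {flow_run_id: "%s"}) { success } }' % flow_run_id


def delete_finished_runs(run_graphql: Callable[[str], Dict]) -> List[str]:
    """
    Fetch the ids of finished flow runs and delete them one by one.
    `run_graphql` is a hypothetical callable: GraphQL string in, response dict out.
    Returns the ids for which the API reported success.
    """
    deleted = []
    for state in FINISHED_STATES:
        runs = run_graphql(build_runs_query(state))["data"]["flow_run"]
        for run in runs:
            result = run_graphql(build_delete_mutation(run["id"]))
            if result["data"]["delete_flow_run"]["success"]:
                deleted.append(run["id"])
    return deleted
```

Passing the GraphQL call in as a function keeps the sketch independent of any particular client, so it can be tried against a stub before pointing it at a real server.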
f
Thank you again! @Kevin Kho
k
Unfortunately, deleting flows does not propagate to deleting tasks. The advice here is to just truncate the logs table.
f
@Kevin Kho is there a way to execute a GraphQL query from a Prefect task, then? I already have a task built for truncating the logs table, but I need a GraphQL query for the task_run and task_run_state tables.
k
f
I have my GraphQL client up and running, but what I meant is whether I can create a flow that executes a GraphQL query on the client.
k
Yes, you can use Client.graphql inside a flow to run a GraphQL query.
f
Thank you! I will check this out 😄
k
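import json
from datetime import date, timedelta

# Assumes the Prefect 1.x top-level exports.
from prefect import Client, task

client = Client()


def build_query(days: int = 7) -> str:
    """
    Returns a GraphQL query for the flow runs (and their states)
    that ended within the last `days` days.
    """
    begin_date = date.today() - timedelta(days=days)
    # %-formatting leaves the GraphQL braces untouched.
    query = """
        query {
            flow_run(where: { end_time: {_gt: "%s"} }) {
                id
                name
                state
                end_time
            }
        }
    """ % begin_date
    return query


@task(retry_delay=timedelta(seconds=60), max_retries=3)
def get_flows(query: str) -> list:
    # Round-trip the result through JSON to get plain dicts and lists.
    result = json.loads(client.graphql(query).to_json())
    return result["data"]["flow_run"]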
Here is a sample usage
f
@Kevin Kho thanks again. I see that GraphQL is limited to 100 results per query, is this correct?
How do I use offset with GraphQL 😂
Nvm I got it
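The thread does not show how the offset question was resolved, so the following is only a sketch of one common approach: request pages with `limit` and `offset` until a short page comes back. The `limit`, `offset` and `order_by` arguments are assumptions about the API (they follow the Hasura-style query syntax used elsewhere in this thread), and `run_graphql` is a hypothetical callable that takes a GraphQL string and returns the parsed response as a dict.

```python
from typing import Callable, Dict, List

# Page size matching the 100-result cap mentioned above.
PAGE_SIZE = 100


def build_page_query(offset: int, limit: int = PAGE_SIZE) -> str:
    """One page of successful flow runs; ordering by id keeps pages stable."""
    return """
    query {
        flow_run(
            where: {state: {_eq: "Success"}},
            order_by: {id: asc},
            limit: %d,
            offset: %d
        ) {
            id
            name
            state
        }
    }
    """ % (limit, offset)


def fetch_all_runs(run_graphql: Callable[[str], Dict], limit: int = PAGE_SIZE) -> List[Dict]:
    """
    Collect every page of results.
    `run_graphql` is a hypothetical callable: GraphQL string in, response dict out.
    """
    runs: List[Dict] = []
    offset = 0
    while True:
        page = run_graphql(build_page_query(offset, limit))["data"]["flow_run"]
        runs.extend(page)
        # A page shorter than the limit means there is nothing left to fetch.
        if len(page) < limit:
            return runs
        offset += limit
```

If rows are deleted while paging, offsets shift, so it is safer to collect all ids first and delete afterwards.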