<@ULVA73B9P> i have a self hosted prefect with a p...
# ask-marvin
a
@Marvin i have a self hosted prefect with a postgresql database. The database is getting larger and larger, because i am keeping all runs for ever. Is there a way to delete old runs after a certain period
m
To manage the size of your Prefect database by deleting old runs, you can use Prefect's API to filter and delete runs based on a specified condition, such as age or state. For Prefect 3.x, you can utilize the Prefect Client to delete flow runs. Here's a general idea of how you can achieve this: 1. Read Flow Runs: Use the
read_flow_runs
method to retrieve runs that you are interested in deleting. You can filter them by state or expected start time. 2. Delete Flow Runs: Once you have your list of runs, use the
delete_flow_run
method to delete them based on their IDs. Here's a succinct overview based on the Prefect 3.x Docs: - Fetch and filter the runs:
Copy code
python
  flow_runs = await client.read_flow_runs(
      flow_run_filter=FlowRunFilter(
          state=dict(name=dict(any_=["Late"])),
          expected_start_time=dict(before_=datetime.now(timezone.utc)),
      )
  )
- Delete the runs:
Copy code
python
  for run in flow_runs:
      await client.delete_flow_run(flow_run_id=run.id)
For Prefect 2.x, the process is quite similar but ensure that the client and functions are compatible with the specific API version of Prefect you're using. If you wish to delve deeper into setting up this process or need more robust examples, you may want to explore the Management of Prefect Database in Prefect 2.x, or review the relevant API Interaction Documentation for the latest version. I'd love to provide a witty remark here, but deleting things sometimes needs a serious note, even for an android.
a
@Marvin in 2.x , can i write a flow to clean up the database
m
In Prefect 2.x, you can certainly write a flow to automate the cleanup of old database records, such as deleting flow runs that are no longer needed. To do this, you would use Prefect's Client API, which allows you to interact programmatically with your Prefect server. Here's a general outline of how you can create a cleanup flow: 1. List Flow Runs: Use the client to list flow runs that match certain criteria, such as those older than a specific date or in a particular state (e.g., "Completed"). 2. Delete Flow Runs: Iterate over these flows and delete them using their IDs. Here's a simplified structure based on what you might do with the client:
Copy code
python
from prefect import Flow
from prefect.client import Client
from datetime import datetime, timedelta

def delete_old_runs():
    client = Client()
    # Define your criteria for old runs, e.g., older than 30 days
    old_date = datetime.utcnow() - timedelta(days=30)
    
    # Fetch old flow runs
    flow_runs = client.read_flow_runs(
        name=None,
        state=None,  # Specify states if needed, like "Completed", "Failed" etc.
        start_time=datetime(2020, 1, 1),  # Fetch runs starting from a past date
        limit=100  # Adjust limit as needed
    )
    
    # Delete old runs
    for flow_run in flow_runs:
        run_start_time = flow_run.created  # Assume flow_run has a created attribute
        if run_start_time < old_date:
            client.delete_flow_run(flow_run.id)
            print(f"Deleted flow run: {flow_run.id}")

with Flow("Cleanup Old Runs") as flow:
    delete_old_runs()

flow.run()
This is a very basic example, and you'll need to adjust it to suit your specific logic and criteria. If you're using SQLAlchemy or another ORM to interact with the database directly, you would also automate the connection and deletion process there. For more tailored guidance, check the Prefect 2.x Client Documentation. You can customize the query parameters to fine-tune which flows are targeted by your cleanup process. It's amazing how even mundane database maintenance can feel like sending old data to a happy place... or the recycle bin.