# ask-community
t
Hi all, I am seeing the following errors in my self-hosted prefect server when under load. How concerned should I be?
13:15:55.415 | WARNING | prefect.server.services.failexpiredpauses - FailExpiredPauses took 29.592499 seconds to run, which is longer than its loop interval of 5.0 seconds.
13:15:55.444 | WARNING | prefect.server.services.flowrunnotifications - FlowRunNotifications took 29.644721 seconds to run, which is longer than its loop interval of 4 seconds.
13:16:28.024 | WARNING | prefect.server.services.failexpiredpauses - FailExpiredPauses took 6.629446 seconds to run, which is longer than its loop interval of 5.0 seconds.
13:16:28.037 | WARNING | prefect.server.services.marklateruns - MarkLateRuns took 6.655057 seconds to run, which is longer than its loop interval of 5.0 seconds.
13:16:28.039 | WARNING | prefect.server.services.recentdeploymentsscheduler - RecentDeploymentsScheduler took 6.648737 seconds to run, which is longer than its loop interval of 5 seconds.
13:16:42.554 | WARNING | prefect.server.services.recentdeploymentsscheduler - RecentDeploymentsScheduler took 5.717242 seconds to run, which is longer than its loop interval of 5 seconds.
13:16:42.569 | WARNING | prefect.server.services.flowrunnotifications - FlowRunNotifications took 5.730481 seconds to run, which is longer than its loop interval of 4 seconds.
13:16:42.573 | WARNING | prefect.server.services.failexpiredpauses - FailExpiredPauses took 5.735392 seconds to run, which is longer than its loop interval of 5.0 seconds.
13:16:42.613 | WARNING | prefect.server.services.marklateruns - MarkLateRuns took 5.777756 seconds to run, which is longer than its loop interval of 5.0 seconds.
13:17:10.327 | WARNING | prefect.server.services.flowrunnotifications - FlowRunNotifications took 19.093586 seconds to run, which is longer than its loop interval of 4 seconds
u
hi Tim, while it's not necessarily a problem that a loop takes a bit longer to run than its interval, it does mean that your DB may have degraded performance under load. i think you'll be fine until you start noticing slower responses on your flow runs / server. there are a few things you could try: increase the resources you've allocated to your DB, run a cleanup job to reduce the amount of data in your DB, or just increase the loop interval settings to silence the warnings.
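To judge how far over the interval each service is running, the warnings themselves can be tallied. This is only a rough sketch that assumes the log lines keep the exact wording shown above; `summarize_overruns` and `PATTERN` are illustrative names, not part of Prefect.

```python
import re
from collections import defaultdict

# Matches the "<Service> took N seconds ... loop interval of M seconds" warnings.
# Assumption: the wording is the same as in the log excerpt above.
PATTERN = re.compile(
    r"- (?P<service>\w+) took (?P<took>[\d.]+) seconds to run, "
    r"which is longer than its loop interval of (?P<interval>[\d.]+) seconds"
)


def summarize_overruns(lines):
    """Return {service: (warning_count, worst_ratio)}.

    worst_ratio is the largest (run time / loop interval) seen for that service,
    rounded to two decimals. Lines that do not match the pattern are skipped.
    """
    stats = defaultdict(lambda: [0, 0.0])
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue
        ratio = float(match["took"]) / float(match["interval"])
        entry = stats[match["service"]]
        entry[0] += 1
        entry[1] = max(entry[1], ratio)
    return {name: (count, round(worst, 2)) for name, (count, worst) in stats.items()}


if __name__ == "__main__":
    import sys

    # Usage: python overruns.py < server.log
    for name, (count, worst) in sorted(summarize_overruns(sys.stdin).items()):
        print(f"{name}: {count} warnings, worst run was {worst}x the interval")
```

A ratio that stays close to 1 is mostly noise; ratios of several times the interval, like the ~29 second runs in the excerpt, point more strongly at the database struggling under load.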
t
That's good to hear - thanks for that! Is there a supported way of cleaning the database? Would this be a Prefect-level operation (say, dropping all flow runs before a date), or is it something more database-oriented that would need to be performed directly on the database engine? Getting more resources is also an option that I can explore.
u
we don't have a supported way of doing that, but what i would do is write a script that iterates through the various objects those services loop through and deletes them according to a date field. then i'd schedule that script to run when the database is under less load -- maybe even schedule it through prefect 😄
t
I could probably do that. Is there a set of objects that are the ideal candidates, i.e. which rows in which tables? Are there any utility functions in Prefect to remove all records associated with a particular task or flow ID?
u
try taking a look at the code for each of those loop services to see what they're looking for in the DB -- for example, this is what the failexpiredpauses loop is doing. use that as a starting point for cleaning things up by periodically deleting the objects it would be querying/iterating through
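A minimal sketch of the date-based cleanup idea from this thread, assuming nothing about Prefect's internals. `RunRecord`, `select_expired`, and `purge` are hypothetical names, the terminal state list is an assumption, and `delete` is whatever callable actually removes a run in your setup (for example a wrapper around your Prefect client's delete call, or a SQL statement). Deleting in small batches with a pause is a design choice to avoid adding load to an already busy database.

```python
import time
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, List, Optional

# Assumption: only runs in these finished states are safe to remove.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED", "CRASHED"}


@dataclass
class RunRecord:
    """Hypothetical stand-in for a flow run row; not a Prefect type."""
    id: str
    state: str
    end_time: Optional[datetime]


def select_expired(runs: Iterable[RunRecord], cutoff: datetime) -> List[RunRecord]:
    """Keep only finished runs that ended before the cutoff."""
    return [
        run for run in runs
        if run.state in TERMINAL_STATES
        and run.end_time is not None
        and run.end_time < cutoff
    ]


def purge(
    runs: Iterable[RunRecord],
    delete: Callable[[str], None],
    retention_days: int = 30,
    batch_size: int = 100,
    pause_seconds: float = 1.0,
    now: Optional[datetime] = None,
) -> int:
    """Delete expired runs in batches, pausing between batches; return the count."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    expired = select_expired(runs, cutoff)
    for start in range(0, len(expired), batch_size):
        for run in expired[start:start + batch_size]:
            delete(run.id)
        # Pause only if another batch follows, to give the database room to breathe.
        if start + batch_size < len(expired):
            time.sleep(pause_seconds)
    return len(expired)
```

Running it first with a `delete` that only logs the IDs is a cheap way to check the selection before anything is actually removed.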