<@ULVA73B9P> I'm self-hosting and getting the foll...
# ask-marvin
o
@Marvin I'm self-hosting and getting the following error:
09:13:53.788 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 47.755087 seconds to run, which is longer than its loop interval of 20.0 seconds.
Also the server is using a lot of CPU.
m
Your issue with the `CancellationCleanup` service taking longer than its loop interval and the server's high CPU usage might be related to database locking. This can occur when the `CancellationCleanup` service attempts to process a large number of flow or subflow runs, leading to prolonged database transactions. To mitigate this, you can try the following:
1. Disable the Cancellation Cleanup Service: Set the environment variable `PREFECT_API_SERVICES_CANCELLATION_CLEANUP_ENABLED` to `false`. This should prevent the service from running and potentially locking the database.
2. Increase the Loop Interval: If disabling the service is not ideal, you can try increasing the loop interval by setting the environment variable `PREFECT_API_SERVICES_CANCELLATION_CLEANUP_LOOP_SECONDS` to a higher value. However, note that this might not completely resolve the issue if the cleanup task itself is inherently long-running.
These steps are based on similar issues reported in the Prefect GitHub repository. For more details, you can refer to this GitHub issue. If these steps don't resolve your issue, consider reviewing your flow structure and database configuration to identify any potential bottlenecks.
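As a rough aid for picking a loop interval, the sketch below parses the warning quoted at the top of the thread and proposes a value at least as large as the observed run time. The function name `suggest_loop_seconds` and the `headroom` factor are illustrative assumptions, not part of Prefect; only the log format and the setting name come from the thread.

```python
import math
import re

# Matches the warning format quoted in the thread; captures the service
# name, the observed run time, and the configured loop interval.
WARNING_RE = re.compile(
    r"(?P<service>\w+) took (?P<took>\d+(?:\.\d+)?) seconds to run, "
    r"which is longer than its loop interval of (?P<interval>\d+(?:\.\d+)?) seconds"
)


def suggest_loop_seconds(log_line, headroom=1.5):
    """Return (service, suggested_interval), or None if the line is not this warning.

    `headroom` is a hypothetical safety factor, not a Prefect setting.
    """
    match = WARNING_RE.search(log_line)
    if match is None:
        return None
    took = float(match.group("took"))
    interval = float(match.group("interval"))
    # Never suggest less than the current interval.
    suggested = max(interval, math.ceil(took * headroom))
    return match.group("service"), suggested


line = (
    "09:13:53.788 | WARNING | prefect.server.services.cancellationcleanup - "
    "CancellationCleanup took 47.755087 seconds to run, which is longer than "
    "its loop interval of 20.0 seconds."
)
service, seconds = suggest_loop_seconds(line)
print(f"{service}: try PREFECT_API_SERVICES_CANCELLATION_CLEANUP_LOOP_SECONDS={seconds}")
```

A longer interval only silences the warning if the run time stays below it; as noted above, it does not make the cleanup itself any faster.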
o
@Marvin What is the normal CPU usage for a Prefect server managing around 4500 flows a week?
m
The CPU usage of a Prefect server can vary significantly based on several factors, including the complexity of the flows, the concurrency of task execution, the infrastructure it is running on, and the database operations involved. There isn't a "normal" CPU usage figure that applies universally, but I can provide some guidance:
1. Flow Complexity: More complex flows with many tasks, especially if they involve heavy computation or data transfer, will naturally use more CPU.
2. Concurrency: Running multiple flows or tasks concurrently will increase CPU usage.
3. Database Operations: Prefect's server interacts frequently with its database to track flow and task states. High database activity can contribute to CPU usage.
4. Infrastructure: The underlying hardware or virtual machine specifications (CPU cores, memory, etc.) will affect how CPU usage is perceived.
For a setup managing around 4500 flows a week, you might expect moderate CPU usage during peak times when many flows are running simultaneously. However, if you notice consistently high CPU usage, it may be worth investigating specific bottlenecks or inefficiencies in flow design or server configuration. Monitoring tools and performance profiling can help identify specific areas of high resource consumption. If you're concerned about performance, you might consider scaling your infrastructure or optimizing flow execution patterns.
o
@Marvin Is it normal for a Prefect server to use ~0.7 CPU at all times? 1. Tasks and flows run on separate workers, not directly on the server. 2. We run ~4500 flows a week. Most don't have a complex structure.
m
Given your setup, where tasks and flows run on separate workers and the server itself is primarily handling orchestration and state tracking, a constant CPU usage of ~0.7 might be on the higher side, but not necessarily unusual depending on the server's role and configuration. Here are some considerations:
1. Server's Role: If your Prefect server is responsible for managing a high volume of flow and task state updates, it will naturally have some baseline CPU usage. This includes handling API requests, database interactions, and possibly scheduling tasks.
2. Database Activity: If the server is also managing a database or interacting heavily with one, this can contribute to the CPU load. Ensure that your database is optimized and that queries are efficient.
3. Background Services: Prefect servers run several background services (like the cancellation cleanup service you mentioned). These services can contribute to CPU usage, especially if they're processing a lot of state changes or other operations.
4. Monitoring and Logging: If you have extensive logging or monitoring configured, this can add to the CPU load, particularly if logs are being processed or sent to an external system.
If the CPU usage doesn't fluctuate significantly with the number of flows being managed or tasks being executed, it might be worth investigating whether any specific service or process is consuming more resources than expected. Monitoring tools can help pinpoint specific processes or operations contributing to the load. If performance becomes a concern, consider scaling your infrastructure or optimizing server configurations.
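To put the quoted volume in perspective, here is a back-of-the-envelope calculation. It assumes "4500 flows a week" means 4500 flow runs spread evenly over the week, which the thread does not confirm; real traffic is likely burstier.

```python
FLOW_RUNS_PER_WEEK = 4500  # figure quoted in the thread
HOURS_PER_WEEK = 7 * 24

# Average rate under the (assumed) even distribution.
runs_per_hour = FLOW_RUNS_PER_WEEK / HOURS_PER_WEEK
runs_per_minute = runs_per_hour / 60

print(f"{runs_per_hour:.1f} runs/hour, {runs_per_minute:.2f} runs/minute")
```

At under one run per minute on average, orchestration traffic alone seems unlikely to account for a constant load, which is consistent with looking at background services and the database first.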
o
@Marvin Could it be caused by a large number of records of past events? Is there a retention time that can be adjusted to clean extremely old records out of the DB?
m
While I couldn't find explicit documentation on configuring data retention policies for Prefect's database, it's reasonable to consider that a large number of past event records could impact performance. Here are some steps and considerations to manage old records:
1. Database Management: Prefect supports both SQLite and PostgreSQL. If you're using SQLite, consider migrating to PostgreSQL for better performance and scalability. PostgreSQL allows for more sophisticated management of large datasets.
2. Manual Cleanup: You might need to implement a manual cleanup process. This could involve writing scripts to delete old records from tables such as flow runs, task runs, and logs. Make sure to back up your database before performing any deletion operations.
3. Database Reset: If the data is not needed for historical purposes, you can reset the database. This is more drastic and will remove all data. Use this option only if you're sure that the old data is no longer needed.
4. Database Optimization: Regular maintenance tasks like vacuuming the database (for SQLite) or using tools like `pg_repack` for PostgreSQL can help optimize performance.
For more information on managing and configuring Prefect's database, you can refer to the Prefect self-hosted documentation. If you need a more tailored solution, consider reaching out to Prefect support or consulting the community for best practices.
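The manual cleanup idea can be sketched as a batched delete, shown here against an in-memory SQLite table. The table `demo_runs` and column `end_time` are hypothetical stand-ins, not Prefect's actual schema; check the real table and column names (and any dependent rows) in your own database, and take a backup, before adapting anything like this.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def delete_old_rows(conn, table, ts_column, cutoff_iso, batch_size=500):
    """Delete rows older than the cutoff in small batches; return rows deleted.

    `table` and `ts_column` are interpolated into SQL, so pass trusted names only.
    """
    total = 0
    while True:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE rowid IN ("
            f"SELECT rowid FROM {table} WHERE {ts_column} < ? LIMIT ?)",
            (cutoff_iso, batch_size),
        )
        conn.commit()  # commit per batch so each transaction stays short
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


# Demo data: one row per day for the last 200 days (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_runs (id INTEGER PRIMARY KEY, end_time TEXT)")
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
rows = [((now - timedelta(days=d)).isoformat(),) for d in range(200)]
conn.executemany("INSERT INTO demo_runs (end_time) VALUES (?)", rows)
conn.commit()

# Keep the last 90 days; ISO-8601 strings in one format compare chronologically.
cutoff = (now - timedelta(days=90)).isoformat()
deleted = delete_old_rows(conn, "demo_runs", "end_time", cutoff, batch_size=25)
remaining = conn.execute("SELECT COUNT(*) FROM demo_runs").fetchone()[0]
print(f"deleted={deleted} remaining={remaining}")
```

Deleting in small committed batches is a deliberate choice here: it keeps each transaction short, which matters if long-running transactions are already contributing to locking.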