Nicholas Pini
07/30/2025, 2:42 PM

Marvin
07/30/2025, 2:43 PM
Here is what is happening with the `asyncio.exceptions.CancelledError` in your Prefect 3.x server:
1. What it is:
The `CancelledError` is a normal part of Python's asyncio operation, particularly when tasks are being shut down or cancelled. In Prefect's context, it often appears during:
- Server shutdown processes
- Task timeouts
- Database connection closures
- Task cancellations (a minimal asyncio illustration follows this list)
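Here is a minimal, self-contained illustration (plain asyncio, not Prefect-specific; all names are placeholders) of how a `CancelledError` surfaces at an `await` point when a task is cancelled:
```python
import asyncio

async def long_running():
    try:
        await asyncio.sleep(60)  # cancellation is raised at this await point
    except asyncio.CancelledError:
        print("cleaning up after cancellation")
        raise  # re-raise so the event loop still records the task as cancelled

async def main():
    task = asyncio.create_task(long_running())
    await asyncio.sleep(0.1)
    task.cancel()  # request cancellation
    try:
        await task
    except asyncio.CancelledError:
        print("task was cancelled, as expected")

asyncio.run(main())
```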
2. Why it's appearing in logs:
These errors in your server logs are likely related to one of these scenarios:
- Database connections being closed during task completion
- Tasks being cancelled during server operations
- Async operations being terminated during shutdown
3. Impact on Performance:
If you're seeing server slowdown, it might be related to:
- Too many concurrent database connections being opened and closed
- Resource cleanup not happening efficiently
- Tasks not being cancelled cleanly
4. Recommendations:
a) Check Database Configuration:
```python
# In your server configuration
database_settings = {
    "pool_size": 20,     # Adjust based on your needs
    "max_overflow": 10,
    "pool_timeout": 30,
}
```
b) Implement Proper Task Cleanup:
```python
from contextlib import asynccontextmanager

from prefect import flow

@asynccontextmanager
async def managed_resources():
    try:
        yield
    finally:
        # Cleanup code here
        pass

@flow
async def your_flow():
    async with managed_resources():
        # Your flow code here
        pass
```
c) Monitor and Adjust:
- Keep an eye on your server's memory usage
- Monitor the number of concurrent tasks
- Consider implementing rate limiting if needed (a tag-based sketch follows this list)
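One hedged way to rate-limit work is a tag-based task concurrency limit. In this sketch the tag name `db` and the limit of 10 are placeholders; the limit itself is created separately (for example with `prefect concurrency-limit create db 10`):
```python
from prefect import flow, task

# Once a concurrency limit exists for the "db" tag, at most that many task
# runs carrying the tag execute at the same time.
@task(tags=["db"])
def query_database(n: int) -> int:
    return n * 2  # placeholder for real database work

@flow
def limited_flow() -> list[int]:
    futures = query_database.map(range(20))
    return [f.result() for f in futures]

if __name__ == "__main__":
    limited_flow()
```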
5. Best Practices:
- Implement proper error handling for async operations
- Use appropriate timeout values for your tasks (see the sketch after this list)
- Consider implementing circuit breakers for database operations
- Monitor server resources regularly
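As a concrete example of timeouts with error handling, here is a minimal sketch using Prefect's task-level `timeout_seconds` and retry options; the timeout values and the URL are placeholders to adjust for your workload:
```python
import asyncio

import httpx
from prefect import flow, task

# timeout_seconds fails the task run if it exceeds the limit; retries and
# retry_delay_seconds control how failures (including timeouts) are retried.
@task(timeout_seconds=300, retries=2, retry_delay_seconds=30)
async def fetch_data(url: str) -> dict:
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        response.raise_for_status()
        return response.json()

@flow(timeout_seconds=900)
async def timed_flow() -> dict:
    # Example URL only
    return await fetch_data("https://httpbin.org/json")

if __name__ == "__main__":
    asyncio.run(timed_flow())
```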
Would you like me to provide more specific guidance based on your server setup or elaborate on any of these points?

Nicholas Pini
07/30/2025, 2:45 PM

Marvin
07/30/2025, 2:46 PM
1. Adjust the Pause Expirations Service:
```yaml
server:
  services:
    pause_expirations:
      enabled: true
      loop_seconds: 10  # Increase from 5 to 10 seconds
```
2. Optimize Database Settings:
```yaml
server:
  database:
    sqlalchemy:
      pool_size: 60     # Adjust based on your needs
      max_overflow: 20  # Increase if needed
      pool_timeout: 60  # Increase timeout
```
3. Monitor and Clean Up:
- Regularly clean up old flow runs
- Monitor database performance
- Check for and cancel any stuck paused flows (a client-based sketch follows this list)
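A hedged sketch for finding and cancelling stuck paused flow runs with the Prefect Python client; the 200-run limit is arbitrary, and you should verify the forced state transition on a test run before using it against production:
```python
import asyncio

from prefect import get_client
from prefect.client.schemas.filters import (
    FlowRunFilter,
    FlowRunFilterState,
    FlowRunFilterStateType,
)
from prefect.client.schemas.objects import StateType
from prefect.states import Cancelled

async def cancel_paused_runs() -> None:
    async with get_client() as client:
        paused = await client.read_flow_runs(
            flow_run_filter=FlowRunFilter(
                state=FlowRunFilterState(
                    type=FlowRunFilterStateType(any_=[StateType.PAUSED])
                )
            ),
            limit=200,  # arbitrary batch size
        )
        for run in paused:
            # force=True because PAUSED -> CANCELLED is not a standard transition
            await client.set_flow_run_state(run.id, state=Cancelled(), force=True)
            print(f"Cancelled paused run {run.id}")

if __name__ == "__main__":
    asyncio.run(cancel_paused_runs())
```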
4. Resource Allocation:
If you're running in a containerized environment:
- Increase CPU allocation
- Increase memory limits
- Consider dedicated database resources
5. Database Maintenance:
- Regular vacuum operations on PostgreSQL
- Index maintenance
- Monitor connection pooling
Implementation Steps:
1. First, check your current database load:
```sql
SELECT count(*)
FROM flow_run
WHERE state_type = 'PAUSED';
```
2. Update your Prefect configuration with adjusted timeouts:
```bash
# Set via environment variables
export PREFECT_SERVER_SERVICES_PAUSE_EXPIRATIONS_LOOP_SECONDS=10
export PREFECT_SERVER_DATABASE_TIMEOUT=120
```
3. Monitor the impact:
- Watch server logs for continued warnings
- Monitor database performance metrics
- Check API response times (see the latency sketch just below)
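A small sketch for spot-checking API response times against the server's `/api/health` endpoint; the URL and sample count are placeholders for your deployment:
```python
import time

import httpx

API_URL = "http://localhost:4200/api"  # adjust to your server

def check_api_latency(samples: int = 5) -> None:
    """Print status code and latency for a few health-check requests."""
    for _ in range(samples):
        start = time.perf_counter()
        response = httpx.get(f"{API_URL}/health", timeout=10)
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(f"status={response.status_code} latency={elapsed_ms:.1f}ms")
        time.sleep(1)

if __name__ == "__main__":
    check_api_latency()
```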
Best Practices:
1. Don't leave flows in a paused state indefinitely
2. Implement regular cleanup of old flow runs (a client-based sketch follows this list)
3. Monitor database performance metrics
4. Consider implementing rate limiting for flow runs
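A hedged sketch of such a cleanup using the Prefect client; the 30-day retention window and the 200-run batch size are arbitrary placeholders:
```python
import asyncio
from datetime import datetime, timedelta, timezone

from prefect import get_client
from prefect.client.schemas.filters import FlowRunFilter, FlowRunFilterStartTime

async def delete_old_flow_runs(days: int = 30, batch: int = 200) -> None:
    """Delete flow runs that started more than `days` ago."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    async with get_client() as client:
        old_runs = await client.read_flow_runs(
            flow_run_filter=FlowRunFilter(
                start_time=FlowRunFilterStartTime(before_=cutoff)
            ),
            limit=batch,
        )
        for run in old_runs:
            await client.delete_flow_run(run.id)
            print(f"Deleted flow run {run.id}")

if __name__ == "__main__":
    asyncio.run(delete_old_flow_runs())
```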
Would you like me to provide more specific guidance based on your deployment setup (e.g., containerized, bare metal) or elaborate on any of these points?

Nicholas Pini
07/30/2025, 3:05 PM

Marvin
07/30/2025, 3:07 PM
1. Adjust API and Database Timeouts:
```yaml
server:
  api:
    keepalive_timeout: 30  # Increase from default
    default_limit: 200     # Reduce if experiencing memory issues
  database:
    timeout: 120           # Increase database timeout
    connection_timeout: 120
```
2. Optimize Database Settings:
```yaml
server:
  database:
    sqlalchemy:
      pool_size: 40
      max_overflow: 10
      pool_timeout: 60
```
3. Configure ASGI Server:
If you're using uvicorn (default for Prefect):
```bash
uvicorn --timeout-keep-alive 60 \
        --workers 4 \
        --log-level warning \
        --proxy-headers
```
4. Environment Variables:
```bash
export PREFECT_API_REQUEST_TIMEOUT=60
export PREFECT_SERVER_API_KEEPALIVE_TIMEOUT=30
```
5. Memory Management:
```yaml
server:
  events:
    maximum_size_bytes: 1000000  # Adjust based on your needs
    maximum_labels_per_resource: 300
```
Troubleshooting Steps:
1. Enable Debug Logging:
```bash
export PREFECT_LOGGING_LEVEL="DEBUG"
export PREFECT_SERVER_LOGGING_LEVEL="DEBUG"
```
2. Check Server Health:
```bash
prefect server health-check
```
3. Monitor Database Connections:
For PostgreSQL:
```sql
SELECT count(*) FROM pg_stat_activity;
```
4. Review Server Logs:
Look for patterns in the timing of ASGI exceptions:
```bash
grep "Exception in ASGI application" server.log
```
Best Practices:
1. Implementation:
- Use connection pooling
- Implement proper error handling
- Set appropriate timeouts
- Monitor resource usage
2. Maintenance:
- Regular log review
- Database maintenance
- Resource monitoring
- Regular health checks
3. Performance Optimization:
- Use appropriate worker count
- Optimize database queries
- Implement caching where appropriate (see the caching sketch after this list)
- Regular cleanup of old data
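As one caching approach, here is a minimal sketch using Prefect's task-level cache with `task_input_hash`; the one-hour expiration is a placeholder, and newer 3.x releases also offer cache policies as the preferred mechanism:
```python
from datetime import timedelta

from prefect import flow, task
from prefect.tasks import task_input_hash

# Results for identical inputs are reused for an hour instead of recomputed.
@task(cache_key_fn=task_input_hash, cache_expiration=timedelta(hours=1))
def expensive_lookup(key: str) -> str:
    print(f"computing result for {key}")
    return key.upper()

@flow
def cached_flow():
    first = expensive_lookup("answer")   # computed
    second = expensive_lookup("answer")  # served from cache
    return first, second

if __name__ == "__main__":
    cached_flow()
```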
Would you like me to:
1. Provide more specific configuration examples?
2. Explain how to implement any of these solutions in detail?
3. Help troubleshoot your specific ASGI exceptions with log analysis?

Nicholas Pini
07/30/2025, 3:09 PM

Marvin
07/30/2025, 3:10 PM
The late runs service periodically marks scheduled flow runs as Late when they have not started by their expected start time. Its configuration:
```yaml
server:
  services:
    late_runs:
      enabled: true          # PREFECT_API_SERVICES_LATE_RUNS_ENABLED
      loop_seconds: 60       # How often to check for late runs
      after_seconds: "PT1M"  # How long to wait before marking as late
```
When to Disable It:
You might want to disable this service if:
1. You don't need late run detection
2. You're experiencing performance issues and want to reduce server load
3. You have your own monitoring system for late runs
How to Configure:
Via environment variables:
```bash
# Disable the service
export PREFECT_API_SERVICES_LATE_RUNS_ENABLED=false

# Or adjust the timing
export PREFECT_API_SERVICES_LATE_RUNS_LOOP_SECONDS=120
export PREFECT_API_SERVICES_LATE_RUNS_AFTER_SECONDS="PT5M"
```
Or in your Prefect configuration:
```python
from prefect.settings import Setting
from prefect.server.services.late_runs import LateRunsSettings

late_runs_settings = LateRunsSettings(
    enabled=False,        # Disable the service
    loop_seconds=120,     # Check every 2 minutes
    after_seconds="PT5M"  # Mark as late after 5 minutes
)
```
Impact on Performance:
- The service runs periodic database queries to check for late runs
- If you have many flow runs, this can contribute to database load
- Disabling it can help if you're experiencing database performance issues
Best Practices:
1. Keep enabled if you need late run detection
2. Adjust loop_seconds based on your needs (higher values = less frequent checks)
3. Set after_seconds based on your acceptable delay threshold
4. Monitor database performance impact
Would you like me to:
1. Provide more specific configuration examples?
2. Explain how to monitor the impact of this service?
3. Help troubleshoot any specific issues you're seeing with late runs?

Nicholas Pini
07/30/2025, 3:11 PM

Marvin
07/30/2025, 3:12 PM
The scheduler service creates upcoming flow runs from your deployments' schedules. Its configuration:
```yaml
server:
  services:
    scheduler:
      enabled: true                # PREFECT_API_SERVICES_SCHEDULER_ENABLED
      loop_seconds: 60             # How often to check for new schedules
      deployment_batch_size: 100   # Number of deployments to process at once
      max_runs: 100                # Maximum runs to schedule at once
      min_runs: 3                  # Minimum runs to maintain in schedule
      max_scheduled_time: "P100D"  # How far in advance to schedule
      min_scheduled_time: "PT1H"   # Minimum time to maintain in schedule
      insert_batch_size: 500       # Batch size for database inserts
```
When to Disable It:
You might want to disable the scheduler service if:
1. You're using external scheduling systems
2. You want to manage flow run creation manually (a run_deployment sketch follows this list)
3. You're experiencing performance issues and want to reduce server load
4. You're running in a distributed setup where another instance handles scheduling
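If you take over run creation yourself, here is a minimal sketch using `run_deployment`; the deployment name "my-flow/my-deployment" and the parameter are placeholders:
```python
from prefect.deployments import run_deployment

# Creates a single flow run from an existing deployment; with timeout=0 the
# call returns immediately instead of blocking until the run finishes.
flow_run = run_deployment(
    name="my-flow/my-deployment",     # placeholder deployment name
    parameters={"example_param": 42},  # placeholder parameter
    timeout=0,
)
print(f"Created flow run {flow_run.id} in state {flow_run.state_name}")
```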
How to Configure:
Via environment variables:
```bash
# Disable the scheduler
export PREFECT_API_SERVICES_SCHEDULER_ENABLED=false

# Or adjust the timing
export PREFECT_API_SERVICES_SCHEDULER_LOOP_SECONDS=120
export PREFECT_API_SERVICES_SCHEDULER_MAX_RUNS=50
```
Impact on Performance:
- The scheduler service periodically queries the database for deployments
- It creates new flow runs based on schedules
- Heavy scheduling load can impact database performance
- Large deployment_batch_size or max_runs values can cause spikes in database usage
Best Practices:
1. Tuning for Scale:
```yaml
server:
  services:
    scheduler:
      loop_seconds: 120          # Increase for less frequent checks
      deployment_batch_size: 50  # Reduce for lower database load
      max_runs: 50               # Limit concurrent scheduling
```
2. Database Optimization:
```yaml
server:
  database:
    timeout: 120  # Increase for larger scheduling operations
    connection_timeout: 120
```
3. Monitoring:
- Watch for scheduler service warnings in logs
- Monitor database performance during scheduling operations
- Track scheduling latency
4. Maintenance:
- Regularly clean up old flow runs
- Monitor scheduling patterns
- Adjust batch sizes based on load
Would you like me to:
1. Provide more specific configuration examples?
2. Explain how to monitor the scheduler's impact?
3. Help troubleshoot any specific scheduling issues?