# ask-marvin
@Marvin in my workload, the logs from flows are currently persisted in the logs table in Postgres. I can see the logs table growing rapidly in size, which could become a problem for querying in the future. Is there partition support for this table, or what is the best way to manage logs?
thought for 119.9 seconds
Great question. Log growth can sneak up fast.

**Short answer**
- Prefect 3.x does not provide built-in Postgres partitioning for the logs table.
- Best practice is to keep a short retention window in Postgres and ship long-term logs to an external system (CloudWatch, GCP Logging, ELK, etc.), or disable API log storage entirely if you don’t need logs in the Prefect UI.

**Options to manage growth**

1) Reduce or disable what’s written to the API (DB)
- Lower the logging level: set `PREFECT_LOGGING_LEVEL=WARNING` (or `ERROR`) to cut volume.
- Disable API log storage entirely: `PREFECT_LOGGING_TO_API_ENABLED=false`
  - Tradeoff: logs will not appear in the Prefect UI, so pair this with an external sink.
- Other tuning:
  - `PREFECT_LOGGING_LOG_PRINTS=false` (avoid capturing `print()` output as logs)
  - Remove noisy loggers via `PREFECT_LOGGING_EXTRA_LOGGERS=...`
- Reference: Settings reference

2) Keep API logs but add a retention policy in Postgres
- Run a scheduled cleanup job (via pg_cron, a cron job, etc.) to delete older rows and vacuum. Example (adjust table/column names to match your schema and your desired retention):

```sql
-- Delete logs older than 30 days
DELETE FROM logs
WHERE timestamp < NOW() - INTERVAL '30 days';

-- Reclaim space and keep planner stats fresh
VACUUM (ANALYZE) logs;
```
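If pg_cron is available, the cleanup can be scheduled inside Postgres itself. A minimal sketch, assuming the pg_cron extension is installed; the job name, schedule, and 30-day window are illustrative:

```sql
-- Run the cleanup nightly at 03:00 (requires the pg_cron extension).
SELECT cron.schedule(
    'prefect-log-cleanup',   -- job name (illustrative)
    '0 3 * * *',             -- every day at 03:00
    $$DELETE FROM logs WHERE timestamp < NOW() - INTERVAL '30 days'$$
);
```

(In most setups autovacuum will reclaim space over time, so the scheduled job only needs the DELETE.)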
- For performance, ensure there is an index on the timestamp column used in the predicate.
- If you expect very large volumes, run the delete in chunks to avoid long locks. Note that Postgres does not support `DELETE ... ORDER BY ... LIMIT` directly, so one chunk can be expressed with a subquery, e.g.:

```sql
DELETE FROM logs
WHERE ctid IN (
    SELECT ctid
    FROM logs
    WHERE timestamp < NOW() - INTERVAL '30 days'
    ORDER BY timestamp
    LIMIT 100000
);
```
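To repeat that until no old rows remain, here is a PL/pgSQL sketch; this is my own example rather than anything Prefect ships, and `COMMIT` inside a `DO` block requires Postgres 11+ and running outside an explicit transaction:

```sql
-- Loop the chunked delete, committing between chunks so each
-- transaction stays short and locks are released promptly.
DO $$
DECLARE
    rows_deleted integer;
BEGIN
    LOOP
        DELETE FROM logs
        WHERE ctid IN (
            SELECT ctid FROM logs
            WHERE timestamp < NOW() - INTERVAL '30 days'
            LIMIT 100000
        );
        GET DIAGNOSTICS rows_deleted = ROW_COUNT;
        EXIT WHEN rows_deleted = 0;
        COMMIT;  -- start a fresh transaction for the next chunk
    END LOOP;
END $$;
```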
- Guidance: Database maintenance

3) Externalize logs for long-term storage/search
- Keep a short retention window in Postgres (e.g., 7–14 days) for UI visibility, and send full logs elsewhere.
- Provide a custom logging config via `PREFECT_LOGGING_CONFIG_PATH` that adds your preferred handlers (CloudWatch/GCP Logging/ELK/etc.). Example minimal YAML that disables API logging and sends JSON logs to stdout for your log collector:

```yaml
version: 1
disable_existing_loggers: false
formatters:
  json:
    format: '{"ts":"%(asctime)s","level":"%(levelname)s","logger":"%(name)s","msg":"%(message)s"}'
handlers:
  console:
    class: logging.StreamHandler
    stream: ext://sys.stdout
    formatter: json
loggers:
  prefect:
    level: INFO
    handlers: [console]
    propagate: False
root:
  level: INFO
  handlers: [console]
```
Set:

```bash
PREFECT_LOGGING_CONFIG_PATH=/path/to/logging.yml
PREFECT_LOGGING_TO_API_ENABLED=false
```
4) Partitioning (advanced, use with caution)
- Native Postgres partitioning (e.g., by month on the log timestamp) can help, but it’s not something Prefect manages or migrates for you. If you adopt partitioning:
  - Test thoroughly in a staging DB.
  - Be aware that future Prefect DB migrations expect the standard table shape; custom partitioning can complicate upgrades.
  - Tools like pg_partman can help automate partition maintenance.
- A rough sketch of the DDL involved is below.
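For illustration only, a minimal sketch of a monthly range-partitioned layout. This is my assumption of what it could look like, not a Prefect-supported migration; the table and partition names are hypothetical, and backfilling plus renaming would be a separate, carefully tested cutover:

```sql
-- Hypothetical: a range-partitioned table with the same columns as logs.
-- Plain LIKE copies column definitions only; indexes and unique constraints
-- must be recreated and must include the partition key.
CREATE TABLE logs_partitioned (LIKE logs)
PARTITION BY RANGE ("timestamp");

-- One partition per month; pg_partman can automate creating these ahead of time.
CREATE TABLE logs_2024_06 PARTITION OF logs_partitioned
    FOR VALUES FROM ('2024-06-01') TO ('2024-07-01');
```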
**Recommendations**
- If you need logs in the Prefect UI: keep a modest retention window in Postgres (e.g., 7–14 days) + nightly cleanup + an external sink for long-term storage.
- If you don’t need UI logs: set `PREFECT_LOGGING_TO_API_ENABLED=false` and route all logs to your centralized logging platform.

A couple of questions to tailor the plan:
- How long do you need logs visible in the Prefect UI?
- Are you open to disabling API logs entirely?
- What Postgres version are you on, and do you have pg_cron or a similar scheduler available?

I can provide a ready-to-run cleanup job or a logging config for your target sink once I know your retention window and environment.