# ask-marvin
Thanks for the detail — and I see the message cut off at the end. Can you confirm what you were planning to try “before pg_repack / VACUUM FULL”? If your goal is to resolve this with tuning and targeted maintenance first, here’s a concrete plan tailored to update‑heavy, small Prefect tables like `deployment`, `work_queue`, and the `automation_*` tables.

Why this is happening
- These specific tables are tiny but updated very frequently by the Prefect server (e.g., heartbeats, watermarks, poll times). That pattern creates lots of dead tuples quickly.
- If autovacuum isn’t aggressive enough, or if long‑running transactions hold back the xmin horizon, dead tuples won’t be reclaimed and bloat accumulates.
- On tiny, hot tables, the default autovacuum thresholds can be too high (the scale factor never triggers; the arithmetic sketch below shows why), and frequent updates to the same pages can outpace vacuum.
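Autovacuum fires once `n_dead_tup` exceeds `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * n_live_tup` (approximately; PostgreSQL actually uses the `reltuples` estimate). A minimal sketch of that arithmetic against the stock defaults (threshold 50, scale factor 0.2):

```sql
-- With stock defaults, a table needs n_dead_tup to exceed
--   50 + 0.2 * n_live_tup
-- before autovacuum touches it. For a 50-row table that is 60 dead
-- tuples, i.e. more dead rows than the table has live ones.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       50 + 0.2 * n_live_tup AS default_trigger_point
FROM pg_stat_user_tables
WHERE relname IN ('deployment','work_queue','automation_event_follower','automation_bucket');
```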
Step 1 — Check for blockers (critical)
If VACUUM isn’t reducing dead tuples, a long‑running transaction is the usual culprit. Run these and share results:

```sql
-- Any long-running transactions that can block pruning?
SELECT pid, usename, state, xact_start, query_start, backend_type, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE datname = current_database() AND xact_start IS NOT NULL
ORDER BY xact_start;

-- Table-level vacuum/analyze history
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum, vacuum_count, autovacuum_count, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname IN ('deployment','work_queue','automation_event_follower','automation_bucket');

-- Any per-table autovacuum overrides already set?
SELECT c.relname, c.reloptions
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname NOT IN ('pg_catalog','information_schema')
  AND c.relname IN ('deployment','work_queue','automation_event_follower','automation_bucket');

-- Check XID age (sanity check)
SELECT c.relname, age(c.relfrozenxid) AS xid_age
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname NOT IN ('pg_catalog','information_schema')
  AND c.relname IN ('deployment','work_queue','automation_event_follower','automation_bucket');
```

If you see “idle in transaction” sessions hanging around for minutes/hours, consider:
- Temporarily setting `idle_in_transaction_session_timeout` (e.g., 5m; see the sketch below)
- Killing offending sessions before running an aggressive vacuum
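A minimal sketch of both, assuming superuser privileges; the 5-minute value and the pid `12345` are placeholders (take the real pid from the Step 1 query output):

```sql
-- Server-wide timeout (requires superuser); revert by setting it back to 0.
ALTER SYSTEM SET idle_in_transaction_session_timeout = '5min';
SELECT pg_reload_conf();

-- Terminate a specific offending backend by pid.
SELECT pg_terminate_backend(12345);
```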
Step 2 — Run an aggressive manual VACUUM and inspect output
This will tell us if old snapshots are preventing cleanup.

```sql
VACUUM (VERBOSE, FREEZE, ANALYZE) public.deployment;
VACUUM (VERBOSE, FREEZE, ANALYZE) public.work_queue;
VACUUM (VERBOSE, FREEZE, ANALYZE) public.automation_event_follower;
VACUUM (VERBOSE, FREEZE, ANALYZE) public.automation_bucket;
```

If the VERBOSE output mentions it could not remove tuples due to old snapshots, go back to Step 1 and clear blockers.
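If it isn’t obvious what is holding the xmin horizon back, a hedged sketch of the usual suspects (long-lived backends, replication slots, and prepared transactions):

```sql
-- Oldest snapshot held by any backend in this cluster.
SELECT pid, usename, state, xact_start, backend_xmin
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC
LIMIT 5;

-- Replication slots can also hold back cleanup.
SELECT slot_name, active, xmin, catalog_xmin FROM pg_replication_slots;

-- As can forgotten two-phase (prepared) transactions.
SELECT gid, prepared, owner FROM pg_prepared_xacts;
```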
Step 3 — Make autovacuum far more aggressive for these hot, tiny tables
For very small, update-heavy tables, make vacuum/analyze trigger on small absolute changes rather than scale factors:

```sql
ALTER TABLE public.deployment SET (
  autovacuum_enabled = true,
  autovacuum_vacuum_scale_factor = 0.0,
  autovacuum_vacuum_threshold = 20,
  autovacuum_analyze_scale_factor = 0.0,
  autovacuum_analyze_threshold = 10,
  autovacuum_vacuum_cost_limit = 2000,
  autovacuum_vacuum_cost_delay = 10
);
-- Apply the same settings to the other hot tables:
ALTER TABLE public.work_queue SET (autovacuum_enabled = true, autovacuum_vacuum_scale_factor = 0.0, autovacuum_vacuum_threshold = 20, autovacuum_analyze_scale_factor = 0.0, autovacuum_analyze_threshold = 10, autovacuum_vacuum_cost_limit = 2000, autovacuum_vacuum_cost_delay = 10);
ALTER TABLE public.automation_event_follower SET (autovacuum_enabled = true, autovacuum_vacuum_scale_factor = 0.0, autovacuum_vacuum_threshold = 20, autovacuum_analyze_scale_factor = 0.0, autovacuum_analyze_threshold = 10, autovacuum_vacuum_cost_limit = 2000, autovacuum_vacuum_cost_delay = 10);
ALTER TABLE public.automation_bucket SET (autovacuum_enabled = true, autovacuum_vacuum_scale_factor = 0.0, autovacuum_vacuum_threshold = 20, autovacuum_analyze_scale_factor = 0.0, autovacuum_analyze_threshold = 10, autovacuum_vacuum_cost_limit = 2000, autovacuum_vacuum_cost_delay = 10);
```
Notes:
- Setting scale factors to 0 forces autovacuum/analyze to fire based on the absolute thresholds (good for small tables).
- You can tune thresholds (e.g., 10–50) based on observed churn.
- Consider temporarily setting `log_autovacuum_min_duration = 0` to log what autovacuum is doing and confirm it’s running frequently enough; revert after tuning (see the sketch below).
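A minimal sketch of that logging toggle, assuming superuser; a value of 0 logs every autovacuum action, so revert once you have confirmed the cadence:

```sql
-- Log every autovacuum/autoanalyze action (0 = no minimum duration).
ALTER SYSTEM SET log_autovacuum_min_duration = 0;
SELECT pg_reload_conf();

-- Revert after tuning (-1 disables this logging entirely).
ALTER SYSTEM SET log_autovacuum_min_duration = -1;
SELECT pg_reload_conf();
```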

Step 4 — Reindex concurrently if indexes are bloated
Even when you reclaim heap tuples, index bloat can linger.
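One hedged way to gauge index bloat first is to compare index sizes against what these tiny row counts would justify:

```sql
-- Index sizes for the hot tables; multi-MB indexes over a handful of
-- live rows are a strong hint of bloat.
SELECT relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname IN ('deployment','work_queue','automation_event_follower','automation_bucket')
ORDER BY pg_relation_size(indexrelid) DESC;
```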
```sql
REINDEX TABLE CONCURRENTLY public.deployment;
REINDEX TABLE CONCURRENTLY public.work_queue;
REINDEX TABLE CONCURRENTLY public.automation_event_follower;
REINDEX TABLE CONCURRENTLY public.automation_bucket;
```

Note: `REINDEX ... CONCURRENTLY` requires PostgreSQL 12 or newer.
Step 5 — Optional hardening for update-heavy tables: FILLFACTOR for HOT updates
For tiny tables that get updated constantly (and don’t change indexed columns), lowering fillfactor leaves room on each page to keep updates HOT, reducing index churn. You can first check how much headroom there is to gain (sketch below), then apply the settings that follow.
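A quick way to see whether updates on these tables are already HOT (assuming statistics have accumulated since the last stats reset):

```sql
-- Fraction of updates that were HOT; a low ratio on these tables means
-- lowering fillfactor (and avoiding indexed-column updates) can help.
SELECT relname, n_tup_upd, n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_update_pct
FROM pg_stat_user_tables
WHERE relname IN ('deployment','work_queue','automation_event_follower','automation_bucket');
```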
```sql
ALTER TABLE public.work_queue SET (fillfactor = 80);
ALTER TABLE public.deployment SET (fillfactor = 80);
ALTER TABLE public.automation_event_follower SET (fillfactor = 80);
ALTER TABLE public.automation_bucket SET (fillfactor = 80);
```

Note: fillfactor applies only to newly written pages. To fully realize the benefit you need a table rewrite (pg_repack, VACUUM FULL, or CLUSTER). If you want to avoid locks, schedule `pg_repack --no-order --wait-timeout=...` during a quiet window.

When to use pg_repack vs VACUUM FULL
- If you confirm no old snapshots and aggressive VACUUM + REINDEX CONCURRENTLY isn’t enough, `pg_repack` is the online, lower-impact way to rewrite and shrink these tables.
- `VACUUM FULL` will also work but takes an exclusive lock; since these are small tables the lock may be short, but it will block writers.

Prefect-specific notes
- Prefect 3.4.7 has no known bug that would uniquely cause this bloat pattern; these tables are just update-heavy by design. If feasible, consider upgrading to the latest 3.x for general fixes.
- For broader DB guidance and retention strategies, see:
  - Prefect docs: Database maintenance
  - Blog: More memory, more problems
  - Blog: Partitioning Postgres without downtime

Next steps
- Let me know:
  - Results of the queries in Step 1 and the VACUUM VERBOSE output
  - Whether you have constraints on downtime/locks (this helps decide between pg_repack and VACUUM FULL); a size snapshot like the sketch below also helps
  - Whether you’re open to setting table-level autovacuum overrides and timeouts (e.g., `idle_in_transaction_session_timeout`)
- I can help you pick exact thresholds and finalize a maintenance runbook once we see the blocker/age stats.
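For that size snapshot, a minimal sketch comparing table and total (table + index) footprints:

```sql
-- On-disk footprint of each hot table, with and without its indexes;
-- a large footprint over a tiny n_live_tup count confirms bloat.
SELECT c.relname,
       pg_size_pretty(pg_table_size(c.oid)) AS table_size,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relname IN ('deployment','work_queue','automation_event_follower','automation_bucket');
```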