<@ULVA73B9P> my RDS DB for prefect is generating a...
# ask-marvin
@Marvin my RDS DB for prefect is generating a lot of errors like:
```
ERROR:  deadlock detected
DETAIL:  Process 13237 waits for ShareLock on transaction 40157444; blocked by process 13098.
Process 13098 waits for ShareLock on speculative token 1 of transaction 40157445; blocked by process 13237.
Process 13237: INSERT INTO task_run (flow_run_id, task_key, dynamic_key, flow_run_run_count, empirical_policy, task_inputs, tags, labels, name, run_count, expected_start_time, total_run_time, id, created, updated) VALUES ($1::UUID, $2::VARCHAR, $3::VARCHAR, $4::INTEGER, $5, $6, $7::JSONB, $8::JSONB, $9::VARCHAR, $10::INTEGER,
Process 13098: UPDATE task_run SET state_id=$1::UUID, state_type=$2::state_type, state_name=$3::VARCHAR, state_timestamp=$4::TIMESTAMP WITH TIME ZONE, updated=now() WHERE task_run.id = $5::UUID AND (task
```
Any idea what’s actually causing this and how to resolve it?
thought for 322.2 seconds
Thanks for sharing the error details — that’s a classic Postgres deadlock pattern involving Prefect’s `task_run` writes.

What’s happening
- You’ve got two concurrent transactions touching the same `task_run` row: one is inserting/upserting a task run record (from the TaskRunRecorder/event ingestion path), and the other is updating the state on that same row.
- The “speculative token” in your log means Postgres is doing INSERT … ON CONFLICT while another session is updating the row; they end up waiting on each other and Postgres aborts one to resolve the cycle.
- This aligns exactly with an ongoing report in Prefect 3.x: prefecthq/prefect#17767. Your snippet (INSERT into `task_run` + concurrent UPDATE of `task_run`) matches the deadlock graphs posted there.
- The warnings about CancellationCleanup running longer than its loop interval are likely a symptom of DB contention (queries waiting on locks or slow I/O), which widens the window in which deadlocks can occur.

Immediate checks and mitigations

1) Confirm environment details (this helps us narrow it down)
- Prefect version and Helm chart version
- Number of API/server replicas
- RDS Postgres version and instance size / storage type
- Any connection poolers (e.g., PgBouncer) in front of RDS
- Typical peak concurrency (number of simultaneous flow runs / workers)

2) Try reducing concurrent writers to isolate the issue
- If you’re running multiple Prefect API pods, scale down to a single replica temporarily. This reduces the number of processes simultaneously inserting/updating `task_run` rows and often eliminates the deadlock entirely. If that solves it, we can discuss safe ways to reintroduce concurrency.

3) Upgrade to the latest 3.x
- We’ve made incremental improvements around event recording and services since 3.3.x. If you’re not on the latest 3.4.x, upgrading is recommended.
- Release notes: Latest Prefect releases

4) Turn on Postgres-level visibility and shorten lock waits
- In your RDS parameter group:
  - `log_lock_waits = on`
  - `deadlock_timeout = '200ms'` (or ~200–500ms)
- Optionally set timeouts so lock waits don’t snowball:
  - `lock_timeout = '2s'`
  - `statement_timeout = '60s'`
- This will produce clearer deadlock reports in the RDS logs and keep the system from getting stuck behind long lock waits. Prefect will generally retry failed statements as part of its operations, but we want to confirm the impact in your environment.

5) Check for active blockers when this happens
Run these when you see the warnings/deadlocks to capture what’s blocking (a small script that wraps these checks is sketched at the end of this message):
```sql
select now(), pid, usename, state, wait_event_type, wait_event, query
from pg_stat_activity
where state != 'idle'
order by query_start;

select l.locktype, l.mode, l.granted, a.pid, a.query, a.wait_event_type, a.wait_event
from pg_locks l
join pg_stat_activity a using (pid)
where not l.granted
order by a.pid;

select indexdef from pg_indexes where tablename = 'task_run';
```
- The first two show who is waiting on whom; the last one confirms the expected indexes exist (migrations should have created them).

6) Ensure the DB isn’t under I/O or autovacuum pressure
- Deadlocks become more likely as operations slow down. Check RDS Performance Insights for top queries, storage IOPS, and long autovacuum runs. If tables are bloated, more aggressive autovacuum settings for the Prefect schema can help shrink lock windows.

Why this happens specifically in Prefect
- Prefect’s event ingestion (TaskRunRecorder) may upsert task-run rows while API/state transitions update them. Under parallel load or multiple API replicas, these touch the same rows in different orders, which Postgres detects as a deadlock (and correctly aborts one transaction).

Next steps
- Could you share:
  - Your Prefect version and Helm chart version?
  - Number of API/server replicas?
  - RDS Postgres version and instance class?
- If you can test with a single API replica and/or after upgrading to the latest 3.4.x, let me know if the deadlocks disappear or drop significantly.
- If the issue persists after these steps, we can dig into the specific queries from RDS logs (with `log_lock_waits`/`deadlock_timeout` enabled) and correlate with Prefect services to see if we should further tune service concurrency or open a targeted bug upstream.

References
- Similar report (matches your stack): Prefect issue #17767
- Release notes: Latest Prefect releases
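If it helps, here is a minimal sketch that wraps the blocking-query check plus a read of the relevant lock settings, so you can run it repeatedly while the deadlocks are happening. It assumes psycopg2 is installed and uses a `PREFECT_DB_DSN` environment variable invented for this example; on RDS the parameters themselves (`log_lock_waits`, `deadlock_timeout`, etc.) are still changed through the parameter group, not from a client session.

```python
# Minimal diagnostic sketch (not Prefect code): print current lock-related
# settings and any queries waiting on ungranted locks. Assumes psycopg2 and a
# PREFECT_DB_DSN env var (hypothetical name) pointing at the RDS instance.
import os

import psycopg2

LOCK_SETTINGS = """
select name, setting, unit
from pg_settings
where name in ('log_lock_waits', 'deadlock_timeout', 'lock_timeout', 'statement_timeout');
"""

BLOCKED_QUERIES = """
select l.locktype, l.mode, l.granted, a.pid, a.query, a.wait_event_type, a.wait_event
from pg_locks l
join pg_stat_activity a using (pid)
where not l.granted
order by a.pid;
"""


def main() -> None:
    with psycopg2.connect(os.environ["PREFECT_DB_DSN"]) as conn:
        with conn.cursor() as cur:
            cur.execute(LOCK_SETTINGS)
            for name, setting, unit in cur.fetchall():
                print(f"{name} = {setting}{unit or ''}")
            cur.execute(BLOCKED_QUERIES)
            for row in cur.fetchall():
                # each row is one ungranted lock plus the query waiting on it
                print(row)


if __name__ == "__main__":
    main()
```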
Thanks for the issue link @Marvin. On a separate question: I saw Prefect now has Redis? Why would I use this? Will it help with increasing performance etc., and are there any prerequisites to enabling it? I’m assuming Helm to deploy Prefect.
thought for 87.1 seconds
Great question — yes, Prefect 3 has a Redis integration, and there are two main ways you might use it:

What Redis is used for in Prefect 3
- Server messaging/event bus backend: By default, Prefect’s server uses an in-memory message bus within each API pod. Switching this to Redis gives you a shared, durable message bus across pods, which helps when you run multiple API replicas and want better failover, at-least-once delivery, and smoother handling of spikes.
- Workflow integration: You can also use Redis directly from your flows/tasks (e.g., simple caching, pub/sub, queues) via the prefect-redis integration.

Will Redis improve performance?
- When it helps:
  - Horizontal scaling: If you run multiple API replicas, Redis avoids each replica having its own isolated in-memory queue, enabling coordinated consumption and reducing duplicate work.
  - Smoothing spikes: Redis buffers bursts of events/messages so the server can process them steadily, which can reduce “service loop took longer than interval” warnings.
  - Resilience: Messages survive pod restarts, improving reliability during deploys or node churn.
- What it won’t do:
  - It won’t replace Postgres or eliminate DB deadlocks by itself. The `task_run` writes still go to Postgres. Redis can reduce pressure indirectly (fewer retries, smoother throughput), but deadlocks from concurrent inserts/updates still need DB-side tuning and/or adjustments to Prefect service concurrency.

Prerequisites
- A Redis instance reachable from your cluster (e.g., AWS ElastiCache for Redis or a self-managed chart). Redis 6+ is typical; ensure auth/TLS as needed.
- Network and capacity: size memory for your expected event/message backlog; consider HA (replication/sentinel or a managed service).
- Make sure the Prefect server image has the Redis integration installed (the `prefect-redis` package). If you’re building a custom image, add `pip install prefect-redis`. (A quick pre-flight check is sketched at the end of this message.)

Helm deployment notes
- You can bring your own Redis (recommended for prod) or deploy a Redis chart alongside Prefect.
- In your Prefect server values, you’ll configure the messaging backend to Redis and provide the Redis connection URL via environment variables/settings.
- Exact settings depend on your chart version — I’m happy to draft a `values.yaml` snippet once you share which Helm chart/version you’re using. The basic idea is:
  - Set the server’s messaging backend to Redis
  - Provide the Redis URL (including auth/TLS parameters)
  - Ensure the server pods have `prefect-redis` available

Docs to review
- Redis integration overview: prefect-redis integration
- Helm deployment guide (for wiring env/settings in your chart): Prefect Helm deployment

If you can share:
- Your Helm chart name/version
- Whether you plan to run multiple API replicas
- Whether you’re using a managed Redis (e.g., ElastiCache) or want to run it in-cluster
…I can provide a concrete `values.yaml` example and call out any settings you’ll want for production (auth, TLS, resource sizing, and any Redis Streams options).
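Before wiring anything into Helm values, a quick pre-flight check from inside the server image can confirm the two prerequisites above. A minimal sketch, assuming the package’s import name is `prefect_redis`, that redis-py is installed alongside it, and with a placeholder URL in place of your real endpoint (add auth/TLS as needed):

```python
# Pre-flight check (illustrative only): is the integration importable, and is
# Redis reachable from this pod? The URL below is a placeholder.
import redis

try:
    import prefect_redis  # noqa: F401  # assumed import name for the prefect-redis package
    print("prefect-redis is installed")
except ImportError:
    print("prefect-redis is NOT installed; add `pip install prefect-redis` to the image")

client = redis.Redis.from_url("rediss://:your-password@your-redis.example.com:6379/0")
print("Redis reachable:", client.ping())  # raises a redis connection/auth error on failure
```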
So out of interest, can this be used to improve performance then or not? @Marvin
thought for 15.6 seconds
Short answer: sometimes. Redis can improve Prefect server performance, but only for the messaging path and mainly when you’re scaling.

When Redis helps
- You run multiple API/server replicas: Redis provides a shared, durable message bus so replicas coordinate instead of each having isolated in-memory queues.
- You see bursts of events/logs: Redis buffers spikes and smooths processing, reducing backpressure and “loop took longer than interval” warnings.
- You want better resilience: messages survive pod restarts, reducing retries/duplicates that can add load.

What it won’t change
- It won’t speed up Postgres queries or eliminate DB deadlocks; `task_run` writes still go to Postgres.
- It won’t make workers execute flows faster.
- If you run a single API pod with modest throughput, you’ll likely see little to no performance gain.

Prereqs (Helm)
- A reachable Redis (managed or in-cluster), sized for your expected event volume and with auth/TLS as needed.
- Ensure the server image includes `prefect-redis`.
- Configure the server to use Redis for messaging in your Helm values.

Docs
- Redis integration overview: prefect-redis integration
- Helm deployment guidance: Prefect Helm deployment

Rule of thumb: if you’re running multiple API replicas or experiencing event/log spikes, Redis is likely to improve throughput and reliability. If you’re single-replica with a stable load, it’s mostly added ops complexity with limited performance upside.
> You run multiple API/server replicas

I have 2 server pods, is that what you mean @Marvin?
thought for 13.0 seconds
Yes — two Prefect server/API pods count as “multiple replicas.” With 2+ API pods:
- Redis can help by providing a shared, durable message bus so both pods coordinate instead of each keeping its own in‑memory queue.
- This usually improves throughput/resilience during spikes and deploys.
- It won’t fix Postgres deadlocks or speed up workers; DB writes still go to Postgres.

If you want, share your Helm chart/version and I’ll give you an exact values.yaml snippet to enable Redis (and note any image/package prerequisites). Doc link for context: prefect-redis integration
@Marvin
> Redis can help by providing a shared, durable message bus so both pods coordinate instead of each keeping its own in‑memory queue.

How is this beneficial exactly?
thought for 20.5 seconds
Great follow-up. Here’s why a shared Redis message bus helps when you have 2+ API/server pods:

- One shared queue instead of per-pod queues
  - Without Redis, each pod has its own in-memory bus. If pod A receives a burst of events, they’re stuck on A even if pod B is idle.
  - With Redis, both pods pull from the same stream, so work is naturally load-balanced.
- Durability and redelivery
  - In-memory messages are lost if a pod restarts.
  - With Redis (via Streams + consumer groups), messages are persisted until acknowledged. If a pod dies mid-batch, another pod will pick up the pending entries (see the sketch at the end of this message).
- Smoother spikes and backpressure
  - Redis buffers bursts so your services process steadily instead of timing out their loops.
  - This usually reduces those “service loop took longer than interval” warnings.
- Fewer fallbacks/polling
  - When pods don’t share a bus, services may fall back to DB polling to “discover” missed events. Redis reduces the need for that, which can lower DB load and contention.
- Safer rolling deploys
  - You can bounce pods without losing in-flight messages; the other pod keeps draining the stream.

Concrete example
- A surge of task state changes lands on pod A. With in-memory messaging, A builds a backlog while B sits idle; if A restarts, some messages are lost. With Redis, both A and B consume from the same stream, halving the backlog, and if A restarts, B continues processing and nothing is dropped.

What it does not change
- It doesn’t replace Postgres. Task/flow state still writes to the DB, so Redis won’t directly fix deadlocks or DB performance issues.
- Worker execution speed is unchanged.

If you’re running 2 server pods, Redis typically improves throughput and resilience by coordinating work across them. If you want, share your Helm chart/version and I can provide an example values.yaml to enable the Redis backend. Docs: prefect-redis integration
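To make the Streams + consumer-group behaviour concrete, here is a small standalone sketch using redis-py. It illustrates the underlying Redis mechanism rather than Prefect’s internal code; the stream/group names and the local URL are made up for the example.

```python
# Illustration of Redis Streams + consumer groups (the mechanism described
# above), not Prefect internals. Two "pods" in one consumer group share a
# single stream, and entries stay pending until they are acknowledged.
import redis

r = redis.Redis.from_url("redis://localhost:6379/0", decode_responses=True)

STREAM, GROUP = "events", "server-pods"  # illustrative names

# Create the consumer group once; ignore the error if it already exists.
try:
    r.xgroup_create(STREAM, GROUP, id="0", mkstream=True)
except redis.ResponseError:
    pass

# A producer appends an event; it is persisted until a consumer acknowledges it.
r.xadd(STREAM, {"event": "task-run.state-change", "payload": "..."})


def consume(consumer: str) -> None:
    # Each entry is delivered to exactly one consumer in the group (shared queue).
    entries = r.xreadgroup(GROUP, consumer, {STREAM: ">"}, count=10, block=1000)
    for _, messages in entries:
        for message_id, fields in messages:
            print(consumer, "processing", message_id, fields)
            # Unacked entries stay in the pending list and can be claimed by
            # another consumer if this one dies mid-batch.
            r.xack(STREAM, GROUP, message_id)


consume("pod-a")  # pod-a and pod-b drain the same stream, so load is shared
consume("pod-b")
```

Running both consume calls against one Redis shows the load-balancing: the pending entry goes to whichever consumer reads first, and the other simply finds nothing left to do.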