ESmith
09/29/2025, 9:22 AM
ERROR: deadlock detected
DETAIL: Process 13237 waits for ShareLock on transaction 40157444; blocked by process 13098.
Process 13098 waits for ShareLock on speculative token 1 of transaction 40157445; blocked by process 13237.
Process 13237: INSERT INTO task_run (flow_run_id, task_key, dynamic_key, flow_run_run_count, empirical_policy, task_inputs, tags, labels, name, run_count, expected_start_time, total_run_time, id, created, updated) VALUES ($1::UUID, $2::VARCHAR, $3::VARCHAR, $4::INTEGER, $5, $6, $7::JSONB, $8::JSONB, $9::VARCHAR, $10::INTEGER,
Process 13098: UPDATE task_run SET state_id=$1::UUID, state_type=$2::state_type, state_name=$3::VARCHAR, state_timestamp=$4::TIMESTAMP WITH TIME ZONE, updated=now() WHERE task_run.id = $5::UUID AND (task
Any idea what's actually causing this and how to resolve?

Marvin
09/29/2025, 9:22 AM

Marvin
09/29/2025, 9:27 AM
This is a deadlock between concurrent task_run writes.
What’s happening
- You’ve got two concurrent transactions touching the same task_run row:
- one is inserting/upserting a task run record (from the TaskRunRecorder/event ingestion path)
- another is updating the state on that same row
- The “speculative token” in your log means Postgres is doing INSERT … ON CONFLICT while another session is updating the row; they end up waiting on each other and Postgres aborts one to resolve the cycle (a minimal two-session sketch is below).
- This aligns exactly with an ongoing report in Prefect 3.x: prefecthq/prefect#17767. Your snippet (INSERT into task_run + concurrent UPDATE of task_run) matches the deadlock graphs posted there.
- The warnings about CancellationCleanup running longer than its loop interval are likely a symptom of DB contention (queries waiting on locks or slow I/O), which increases the window where deadlocks can occur.
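To make that interaction concrete, here's a minimal two-session sketch against a hypothetical table t (not Prefect's schema) that produces the same flavor of deadlock: each session ends up waiting on a row the other has locked, with one side arriving via INSERT … ON CONFLICT.
-- one-time setup (hypothetical table, purely for illustration)
create table t (id int primary key, v text);
insert into t values (1, 'x'), (2, 'x');

-- session A
begin;
update t set v = 'a' where id = 1;          -- A now holds the row lock on id=1

-- session B
begin;
update t set v = 'b' where id = 2;          -- B now holds the row lock on id=2
insert into t (id, v) values (1, 'b')
  on conflict (id) do update set v = 'b';   -- blocks: the conflicting row (id=1) is locked by A

-- session A again
update t set v = 'a' where id = 2;          -- blocks on B; Postgres detects the cycle and aborts one transaction
This won't reproduce the speculative-token wait byte-for-byte, but it shows why an upsert and an update converging on the same rows from two sessions is enough for Postgres to abort one of them.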
Immediate checks and mitigations
1) Confirm environment details (this helps us narrow it down)
- Prefect version and Helm chart version
- Number of API/server replicas
- RDS Postgres version and instance size / storage type
- Any connection poolers (e.g., PgBouncer) in front of RDS
- Typical peak concurrency (number of simultaneous flow runs / workers)
2) Try reducing concurrent writers to isolate the issue
- If you’re running multiple Prefect API pods, scale down to a single replica temporarily. This will reduce the number of processes simultaneously inserting/updating task_run rows and often eliminates the deadlock entirely. If that solves it, we can discuss safe ways to reintroduce concurrency.
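To confirm how many writers are actually connected when this happens, a quick look at pg_stat_activity grouped by application should be enough (standard catalog columns; run it against the Prefect database):
select application_name, state, count(*)
from pg_stat_activity
where datname = current_database()
group by application_name, state
order by count(*) desc;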
3) Upgrade to the latest 3.x
- We’ve made incremental improvements around event recording and services since 3.3.x. If you’re not on the latest 3.4.x, upgrading is recommended.
- Release notes: Latest Prefect releases
4) Turn on Postgres-level visibility and shorten lock waits
- In your RDS parameter group:
- log_lock_waits = on
- deadlock_timeout = '200ms' (or ~200–500ms)
- Optionally set timeouts so lock waits don’t snowball:
- lock_timeout = '2s'
- statement_timeout = '60s'
- This will produce clearer deadlock reports in the RDS logs and keep the system from getting stuck behind long lock waits. Prefect will generally retry failed statements as part of its operations, but we want to confirm impact in your environment.
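On RDS the persistent values go in the parameter group, but you can sanity-check what's currently in effect, and trial the timeouts per session, with plain SQL (the values shown are examples, not prescriptions):
show log_lock_waits;
show deadlock_timeout;
-- session-only trial of the timeouts; persistent changes belong in the RDS parameter group
set lock_timeout = '2s';
set statement_timeout = '60s';
show lock_timeout;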
5) Check for active blockers when this happens
Run these when you see the warnings/deadlocks to capture what’s blocking:
select now(), pid, usename, state, wait_event_type, wait_event, query
from pg_stat_activity
where state != 'idle'
order by query_start;
select l.locktype, l.mode, l.granted, a.pid, a.query, a.wait_event_type, a.wait_event
from pg_locks l
join pg_stat_activity a using (pid)
where not l.granted
order by a.pid;
select indexdef from pg_indexes where tablename='task_run';
- The first two will show who is waiting on whom; the last one ensures expected indexes exist (migrations should have created them).
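For a rough sense of how often deadlocks are actually firing, the per-database counter in pg_stat_database is also worth snapshotting over time (it only ever increases, so compare readings):
select datname, deadlocks, xact_commit, xact_rollback
from pg_stat_database
where datname = current_database();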
6) Ensure the DB isn’t under I/O or autovacuum pressure
- Deadlocks become more likely as operations slow down. Check RDS Performance Insights for top queries, storage IOPS, and long autovacuum runs. If tables are bloated, aggressive autovacuum settings for the Prefect schema can help shrink lock windows.
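A quick way to spot bloat or stalled autovacuum is the per-table stats view, e.g.:
select relname, n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze
from pg_stat_user_tables
order by n_dead_tup desc
limit 20;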
Why this happens specifically in Prefect
- Prefect’s event ingestion (TaskRunRecorder) may upsert task-run rows while API/state transitions update them. Under parallel load or multiple API replicas, these touch the same rows in different orders, which Postgres considers a deadlock (and correctly aborts one transaction).
Next steps
- Could you share:
- Your Prefect version and Helm chart version?
- Number of API/server replicas?
- RDS Postgres version and instance class?

ESmith
09/29/2025, 9:36 AM

Marvin
09/29/2025, 9:36 AM

Marvin
09/29/2025, 9:37 AM
A caveat before the setup details: Redis takes over messaging/event traffic, but task_run writes still go to Postgres. Redis can reduce pressure indirectly (fewer retries, smoother throughput), but deadlocks from concurrent inserts/updates still need DB-side tuning and/or adjustments to Prefect service concurrency.
Prerequisites
- A Redis instance reachable from your cluster (e.g., AWS ElastiCache for Redis or a self-managed chart). Redis 6+ is typical; ensure auth/TLS as needed.
- Network and capacity: Size memory for your expected event/message backlog; consider HA (replication/sentinel or managed service).
- Make sure the Prefect server image has the redis integration installed (the prefect-redis package). If you’re building a custom image, add pip install prefect-redis.
Helm deployment notes
- You can bring your own Redis (recommended for prod) or deploy a Redis chart alongside Prefect.
- In your Prefect server values, you’ll configure the messaging backend to Redis and provide the Redis connection URL via environment variables/settings.
- Exact settings depend on your chart version — I’m happy to draft a values.yaml snippet once you share which Helm chart/version you’re using. The basic idea is:
- Set the server’s messaging backend to Redis
- Provide the Redis URL (including auth/TLS parameters)
- Ensure the server pods have prefect-redis available
Docs to review
- Redis integration overview: prefect-redis integration
- Helm deployment guide (for wiring env/settings in your chart): Prefect Helm deployment
If you can share:
- Your Helm chart name/version
- Whether you plan to run multiple API replicas
- Whether you’re using a managed Redis (e.g., ElastiCache) or want to run it in-cluster
…I can provide a concrete values.yaml example and call out any settings you’ll want for production (auth, TLS, resource sizing, and any Redis Streams options).

ESmith
09/29/2025, 9:40 AM

Marvin
09/29/2025, 9:40 AM

Marvin
09/29/2025, 9:41 AM
- It won't eliminate the deadlocks: task_run writes still go to Postgres.
- It won’t make workers execute flows faster.
- If you run a single API pod with modest throughput, you’ll likely see little to no performance gain.
Prereqs (Helm)
- A reachable Redis (managed or in-cluster), sized for your expected event volume and with auth/TLS as needed.
- Ensure the server image includes prefect-redis.
- Configure the server to use Redis for messaging in your Helm values.
Docs
- Redis integration overview: prefect-redis integration
- Helm deployment guidance: Prefect Helm deployment
Rule of thumb: If you’re running multiple API replicas or experiencing event/log spikes, Redis is likely to improve throughput and reliability. If you’re single-replica and stable load, it’s mostly added ops complexity with limited performance upside.