# ask-community
l
@alex starting a discussion thread for https://github.com/PrefectHQ/prefect/issues/18933
We are currently horizontally scaling the Prefect server, but we don't use Redis and we aren't super keen on configuring it. What do you think about some version of this? We could extend /get_scheduled_flow_runs (or add a new endpoint, e.g. /get_assigned_scheduled_flow_runs) so that it gives ownership of the returned runs to the polling worker and reassigns any runs held by a dead worker (identified by its heartbeat):
UPDATE flow_run
SET worker_id = 'xxx'
WHERE id IN (
    SELECT flow_run.id
    FROM flow_run
    LEFT JOIN worker_heartbeat
        ON flow_run.worker_id = worker_heartbeat.worker_id
    WHERE flow_run.status = 'scheduled'
       OR (flow_run.status = 'pending' AND worker_heartbeat.worker_is_alive = false)
    LIMIT 100
);
This has the added benefit of letting workers horizontally scale without stepping on each other's toes - something we are having trouble with in K8S.
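As a minimal sketch of that idea, assuming Postgres and the same hypothetical worker_id column as in the query above, the claim could be made atomic with FOR UPDATE SKIP LOCKED, so that concurrent pollers cannot select the same rows:
UPDATE flow_run
SET worker_id = 'xxx'
WHERE id IN (
    -- SKIP LOCKED makes concurrent pollers skip rows that another
    -- worker is claiming in a not-yet-committed transaction (Postgres).
    SELECT id
    FROM flow_run
    WHERE status = 'scheduled' AND worker_id IS NULL
    LIMIT 100
    FOR UPDATE SKIP LOCKED
)
RETURNING id;
Reassigning runs held by a dead worker (the worker_heartbeat check above) could then run as a separate periodic sweep instead of sitting in the hot polling path.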
a
I think you're going to run into issues eventually if you scale horizontally without using Redis. We use Redis for caching and event ordering when recording task run state, so you can run into DB deadlocks if you run multiple servers without Redis. I also think keeping this info in the DB will put a lot of strain on the DB at high volumes. The polling query is already expensive, so I'm wary of adding more to it. If I remember correctly, you're regularly running thousands of flows at once, and I don't think this will scale to that level. I really think that if you want to operate at that scale, you'll need to use Redis. Can you elaborate more on workers stepping on each other's toes?
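To illustrate the polling-cost concern: if an assignment column were added, the poll would probably also need an index along these lines. This is only a sketch against the hypothetical schema used earlier in the thread, not the actual Prefect tables:
-- Illustrative only: column names follow the hypothetical query above;
-- next_scheduled_start_time stands in for whatever column the poll sorts by.
CREATE INDEX ix_flow_run_unassigned_scheduled
    ON flow_run (next_scheduled_start_time)
    WHERE status = 'scheduled' AND worker_id IS NULL;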
l
I wasn't aware that Redis could help with DB load, so that makes for a compelling argument. We will work on setting that up this week and then I will proceed with implementing your proposal. As for the workers, we see logs like these. Essentially, multiple workers poll the server and get a similar set of flow runs to schedule, then compete to mark the flows as pending and submit them to K8s. The problem gets worse the more workers you have.
Worker 'KubernetesWorker 0effa744-0226-4eb0-b718-8f016beac43f' submitting flow run 'b21c392d-3f0a-4ede-b6d3-1756ffc54716'
Worker 'KubernetesWorker 6de4ecc3-90d7-45ef-9ca5-8df8bd924c1e' submitting flow run 'b21c392d-3f0a-4ede-b6d3-1756ffc54716'
Aborted submission of flow run 'b21c392d-3f0a-4ede-b6d3-1756ffc54716'. Server sent an abort signal: This run is in a PENDING state and cannot transition to a PENDING state.
Creating Kubernetes job...
@alex we tested a Redis setup for ourselves and we have some concerns. We did observe that CPU load on the DB was reduced by about a factor of two; however, the Redis instance was at 100% CPU usage even for a subset of our full workload (1.5K deployed flows). We tried using a Redis cluster, but we got CrossSlot exceptions since Prefect does not support sharded Redis. We are concerned that Redis is unable to scale to meet our demands with the current setup. Is our understanding correct, or are we missing something? If it is correct, using Redis to solve this ticket will probably not help us.
a
What's the size of the Redis instance that you're using?
l
We used the largest instance available on GCP (XLARGE_HIGHMEM), which is 8 cores and 58 GB of memory. Memory usage was extremely minimal by comparison.
@alex bump. any path forward? I would love to start work on the issue next week, but we would like some confidence that Redis is the right choice for us
a
That level of CPU load is surprising. What's the read/write volume look like when the Redis CPU is at 100%?
l
Screenshot 2025-10-07 at 2.36.58 PM.png
100:1 write to read
the first bump is 1/3 of our load, the second bump is our entire load
a
How many server replicas are you running? I'm wondering if we can improve our Redis connection handling to reduce some of the load.
l
We have autoscaling on, with each server allocated 1 CPU, and we can spin up to 200 servers/CPUs in one run.
a
Ah, the connection load doesn't seem that wild then. We could look into clustering support. We use Redis streams for events, and I'm not sure whether that's supported in clustering. All that being said, the execution time doesn't look too worrying despite being at max CPU.
l
I agree it performed quite well, but we are interested in scaling further past our current point 🙂 We are not Redis experts, so we are unsure what will happen as we keep pushing it. Based on some rudimentary research, it doesn't seem like clustering prohibits streams; one just needs to be careful about multi-stream reads. If clustering support were added, that would make us a lot more comfortable moving forward.
We have thought about this some more with the team. We are not comfortable running Redis at 100% CPU without any scaling guarantees, and it does not sound like clustering support is a roadmap priority any time soon. We have decided to pursue a solution outside of the framework, so I will unfortunately have to withdraw from implementing this fix at this time.
a
I definitely think it would be worthwhile to support Redis clustering for high-scale use cases like yours. Any chance you'd be willing to help enable cluster support for prefect-redis?
l
I would be interested in taking a look. As I understand it, there are a lot of "CrossSlot" errors when using a Redis cluster, which implies that many of the Redis queries would have to be rewritten so they don't span multiple nodes in the cluster, and that may be substantial work.
a
Yeah, the Lua scripts will probably require the most work. I might have some time to dig into that this week.