# ask-community
Hello everyone! I am starting to play with Prefect and I think I might be doing something wrong. I have a self-hosted Prefect instance in a Kubernetes cluster on AWS EKS. I am testing with a simple flow like this one:
```python
import time

from prefect import flow


@flow
def hello():
    # Simulate a long-running job, then log a message
    time.sleep(300)
    print("hello world")
```
I am packaging my flow into a python:3.11-slim image, with just this code and a pip install of the prefect dependency, trying to keep things as minimal as possible. We have a deployment that we set to use our Docker image.

When executing this in our cluster, every time we run the deployment we get a new pod executing the flow. Every pod takes 150-200 MB of RAM (not counting the prefect-server and prefect-worker pods). If I need to run, for example, 10,000 concurrent jobs, I will have 10,000 pods x 200 MB each => roughly 2,000,000 MB (~2 TB) of RAM. That is something we cannot afford.

Are we doing something wrong? Can we run our planned workload in another way that uses fewer resources? We are starting to think Prefect is not for our use case: we just want to run a bunch of small jobs and benefit from Prefect's concurrency management, but we cannot afford every single task having a starting memory footprint of 150-200 MB.
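For context, this is roughly how the deployment is created; a minimal sketch for a recent Prefect version, assuming a Kubernetes work pool called `k8s-pool` and an image that is built and pushed separately (the pool, deployment, and image names are placeholders, not our real ones):

```python
import time

from prefect import flow


@flow
def hello():
    time.sleep(300)
    print("hello world")


if __name__ == "__main__":
    # Each run of this deployment is executed by the Kubernetes worker
    # as a brand-new pod based on the image below.
    hello.deploy(
        name="hello-k8s",                                # placeholder deployment name
        work_pool_name="k8s-pool",                       # placeholder Kubernetes work pool
        image="registry.example.com/hello-flow:latest",  # placeholder, already-pushed image
        build=False,  # the image is built and pushed outside of Prefect
        push=False,
    )
```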
Let's say 50 concurrent tasks for easy numbers. For those I need a minimum of 10 GB of RAM. We are an early-stage startup, and for us that will get expensive. I have the impression that having Celery manage this workload for us could reduce the memory footprint significantly. Obviously we would lose the fancy dashboard and observability, but this specific workload is just background jobs.
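To make the numbers concrete, this is the back-of-the-envelope math behind the figures above (plain arithmetic; 200 MB per pod is our observed upper bound, not a measured average):

```python
# Back-of-the-envelope memory math for a one-pod-per-flow-run setup.
POD_RAM_MB = 200  # observed upper bound per flow-run pod


def total_ram_gb(concurrent_runs: int, pod_ram_mb: int = POD_RAM_MB) -> float:
    """Total RAM needed if every concurrent run gets its own pod."""
    return concurrent_runs * pod_ram_mb / 1024


print(f"50 concurrent runs     -> ~{total_ram_gb(50):.1f} GB")    # ~9.8 GB
print(f"10,000 concurrent runs -> ~{total_ram_gb(10_000):.0f} GB")  # ~1953 GB (~2 TB)
```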
@Kevin Grismore thanks for your answer. We've investigated background tasks, but it seems it's not exactly what we need. We are now thinking of using process workers in our Kubernetes cluster instead of Kubernetes workers. I elaborate on what we are thinking here, and I would love to have your opinion: https://prefect-community.slack.com/archives/CL09KU1K7/p1762429122837369
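Roughly, the idea is to run one (or a few) long-lived process worker pods and let every flow run execute as a subprocess of that worker instead of as a fresh pod. A minimal sketch, assuming a process work pool called `process-pool` (created with `prefect work-pool create process-pool --type process` and served by `prefect worker start --pool process-pool`) and our flow code living in a Git repo; the repo URL, entrypoint, and names below are placeholders:

```python
from prefect import flow

if __name__ == "__main__":
    # Pull the flow code from source at run time instead of baking it into
    # an image; the process worker then runs each flow run as a local subprocess.
    flow.from_source(
        source="https://github.com/our-org/our-flows.git",  # placeholder repo
        entrypoint="flows/hello.py:hello",                   # placeholder path:function
    ).deploy(
        name="hello-process",           # placeholder deployment name
        work_pool_name="process-pool",  # placeholder process work pool
    )
```

The open question for us is whether one subprocess per run is materially lighter than one pod per run, which is what we want to validate.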