
Andrew Lawlor

02/17/2022, 5:37 PM
For a KubernetesRun, I set a memory request, but I see errors when the job needs more memory. I didn't set a memory_limit. Is it possible to have it scale up automatically if I need more memory than the request?

Anna Geller

02/17/2022, 5:57 PM
The best way you could approach it is to balance the memory request and memory limit based on your needs. If the node where a Pod is running has enough of a resource available, it's possible (and allowed) for a container to use more resource than its request for that resource specifies. However, a container is not allowed to use more than its resource limit. You can read more on that here and here.
I probably didn't tell you anything you didn't know so far 🙂 basically the most you can do is run your flow once with a specific limit and if this is not enough and you get OOM errors, you would need to adjust your run config and figure it out by experience. You can also do some profiling locally first by using psutil or something similar
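A rough sketch of the local profiling idea, in case it helps. It uses the standard library's tracemalloc instead of psutil so that it runs without extra installs; tracemalloc only counts memory allocated by Python itself, so a psutil-based check of the process RSS would likely report a higher number. The names peak_memory_mib and load_rows are hypothetical placeholders, not part of Prefect.

```python
import tracemalloc


def peak_memory_mib(func, *args, **kwargs):
    """Run func and return (result, peak Python-allocated memory in MiB)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak_bytes / (1024 * 1024)


def load_rows(n):
    # Hypothetical stand-in for the real ETL step being profiled.
    return [str(i) * 10 for i in range(n)]


rows, peak = peak_memory_mib(load_rows, 100_000)
print(f"peak of roughly {peak:.1f} MiB for {len(rows)} rows")
```

The measured peak, plus some headroom for the interpreter and libraries, could serve as a first guess for the memory request.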

Andrew Lawlor

02/17/2022, 6:03 PM
Thanks for the documentation, I'm reviewing it. I'm pretty inexperienced with Kubernetes, so I'm happy with any resources. I'm a little confused because I didn't set a limit, only a request, so I assumed it would scale up, but that doesn't seem to be happening.

Anna Geller

02/17/2022, 6:08 PM
correct, the memory request is what Kubernetes uses to decide on which node to place a pod, and a limit (if set) caps what the container may use. Beyond that it depends on the node's capacity

Andrew Lawlor

02/17/2022, 6:14 PM
Oh, I see. So if it places it on a node and the node ends up being too small, it won't place it on a bigger node?

Anna Geller

02/17/2022, 6:26 PM
so Prefect deploys your flow runs as Kubernetes jobs. Prefect submits the request to the Kubernetes API including the memory request and limit and Kubernetes uses that information to figure out on which node in the cluster to deploy that job/pod. Therefore you ideally should have a ballpark estimate on the required memory before you even deploy your flow. You could use psutil as suggested before to figure out the memory requirements for your flow. But if you don't want to deal with it and you would prefer in the worst case to pay for over-provisioned resources, you can always assign some ridiculously high memory limits 🙂 provided that your nodes have that much capacity in the first place or that you use autopilot/serverless
🤯 1

Matthias Roels

02/18/2022, 12:15 PM
@Andrew Lawlor indeed! Kubernetes uses resource requests/limits to schedule pods on a particular node. If you haven't set any resource limits, the node is large enough and still has resources available, you are allowed to use more memory than requested. But if that's not the case, you get an OOM.
upvote 1
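To make the cases above concrete, here is a toy model in plain Python. It is only a sketch under simplifying assumptions: memory_outcome and its parameters are hypothetical, node_free_mib stands for the memory the node can still give to this pod, and real Kubernetes behaviour (eviction order, other pods competing for memory) is more involved.

```python
from typing import Optional


def memory_outcome(
    usage_mib: int,
    request_mib: int,
    limit_mib: Optional[int],
    node_free_mib: int,
) -> str:
    """Very simplified view of what happens to a pod's container."""
    # Scheduling only considers the request: it must fit on the node.
    if request_mib > node_free_mib:
        return "unschedulable"
    # A limit, when set, is a hard cap on the container.
    if limit_mib is not None and usage_mib > limit_mib:
        return "oom_killed"
    # Without a limit, the container can grow past its request,
    # but only as far as the node itself has memory to give.
    if usage_mib > node_free_mib:
        return "node_out_of_memory"
    return "ok"


# No limit set, node has room: going over the request is allowed.
print(memory_outcome(usage_mib=3000, request_mib=1000, limit_mib=None, node_free_mib=4000))
# No limit set, node too small: the job fails instead of moving to a bigger node.
print(memory_outcome(usage_mib=6000, request_mib=1000, limit_mib=None, node_free_mib=4000))
```

The second call mirrors the situation in this thread: the pod is not rescheduled onto a larger node once it is running.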

Anna Geller

02/18/2022, 1:47 PM

Andrew Lawlor

02/18/2022, 2:21 PM
OK, I guess I need to decide the best practice architecturally. My flow is for ETL and it does incremental loading, so it usually doesn't require very much memory. But occasionally I will want to flush the table and reload everything (because the upstream data changed), and that will require a lot of memory.

Anna Geller

02/18/2022, 2:31 PM
Gotcha. Actually, Prefect flow seems to be perfect for that use case because for this run when you reload everything you could simply provide more memory in your KubernetesRun run config, e.g. directly from the UI:
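A per-run override like this could also be sketched in code. The helper below is hypothetical and the memory values are placeholders rather than recommendations; it only builds the keyword arguments, and the commented lines show how they might be passed to KubernetesRun in Prefect 1.x, assuming a flow object already exists.

```python
def memory_settings(full_reload: bool) -> dict:
    """Hypothetical helper: pick memory settings for one flow run."""
    if full_reload:
        # Occasional full table reload: ask for much more memory.
        return {"memory_request": "8Gi", "memory_limit": "16Gi"}
    # Usual incremental load: a small footprint is enough.
    return {"memory_request": "512Mi", "memory_limit": "1Gi"}


# With Prefect 1.x installed, the settings could be applied roughly like this:
#   from prefect.run_configs import KubernetesRun
#   flow.run_config = KubernetesRun(**memory_settings(full_reload=True))
print(memory_settings(full_reload=True))
```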

Andrew Lawlor

02/18/2022, 7:38 PM
Oh, that's a good point. I believe that would fix my issue.
👍 1