# ask-community
l
Hi everybody, has anybody using the local agent noticed any interesting spikes in memory usage? We use the local agent with the Dask executor. Our load is not that intensive at the moment - we run ~30 Flows during the spike. Regularly, we run into a situation where the local agent's memory consumption spikes from ~150MB to 1.5-2.1GB. Has anybody had a similar experience? At the moment we're diving into debugging and profiling the issue, so I'll post our findings here if they turn out to be relevant.
a
@Lukáš Polák how big are your Flows? Are you using pickle-based storage? If some objects in your Flow are big and the Flow gets pickled and then read by the agent at runtime, perhaps that could increase the memory consumption on the agent? Let's see whether anyone from the community has had the same issue.
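A small sketch of the pickle-based vs. script-based distinction being asked about, assuming Prefect 0.14.x S3 storage; the bucket and key names are made up:
```python
from prefect.storage import S3

# Default: the Flow object itself is cloudpickled and uploaded to the bucket,
# so any large objects captured by the Flow travel with it and get loaded
# wherever the flow is deserialized.
pickled = S3(bucket="my-flows-bucket")

# Script-based: only the flow's source file is referenced, nothing is pickled.
script_based = S3(bucket="my-flows-bucket", stored_as_script=True, key="flows/my_flow.py")
```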
l
actually, we don't send complex data structures between tasks - only basic values. All larger objects are passed between tasks as S3 keys. As Flow storage, we use Module storage. Based on our measurements, each flow adds ~100MB per process. Hence, we end up with about 2.1GB when running all the flows in parallel 😕 Is this expected?
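A minimal sketch of the setup described above, assuming Prefect 0.14.x; the bucket name, module path, and Dask scheduler address are illustrative:
```python
import io
import uuid

import boto3
from prefect import Flow, task
from prefect.executors import DaskExecutor
from prefect.storage import Module


@task
def produce_large_object() -> str:
    """Upload a large payload to S3 and return only its key."""
    key = f"intermediate/{uuid.uuid4()}.bin"
    boto3.client("s3").upload_fileobj(io.BytesIO(b"x" * 1_000_000), "my-bucket", key)
    return key


@task
def consume(key: str) -> None:
    """Fetch the payload by key instead of receiving it as a task result."""
    body = boto3.client("s3").get_object(Bucket="my-bucket", Key=key)["Body"].read()
    print(len(body))


with Flow(
    "s3-key-passing",
    storage=Module("my_project.flows"),               # flow code is imported, not pickled
    executor=DaskExecutor("tcp://dask-scheduler:8786"),  # remote Dask cluster
) as flow:
    consume(produce_large_object())
```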
a
Thanks, I’ll look at how Module storage handles memory
@Lukáš Polák I did not find anything suspicious in Module storage regarding memory. I will ask around about what we can even look at to investigate. Are you sure those memory spikes come from Prefect? Could we perhaps double-check which processes are consuming the resources?
m
Hi @Anna Geller, we're pretty certain that the spikes come from Prefect (sub)processes – we've instrumented the LocalAgent by periodically inspecting `self.processes` with `psutil` and noticed that the overhead is ~100MB per flow (since the agent creates a new process using Popen that launches a Python interpreter running `prefect execute flow-run`).
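A rough sketch of that kind of instrumentation, assuming Prefect 0.14.x; only `self.processes` and the one-Popen-per-flow-run behaviour come from the agent itself, while the `log_flow_run_memory` helper and the 30-second interval are made up for illustration:
```python
import threading

import psutil
from prefect.agent.local import LocalAgent


def log_flow_run_memory(agent: LocalAgent, interval: float = 30.0) -> None:
    """Periodically log the RSS of every flow-run subprocess the agent tracks."""

    def sample() -> None:
        # LocalAgent keeps the Popen handles of the spawned
        # `prefect execute flow-run` interpreters in `agent.processes`.
        for popen in list(agent.processes):
            try:
                rss_mb = psutil.Process(popen.pid).memory_info().rss / 1024 ** 2
                agent.logger.info("flow-run pid %s: %.1f MB RSS", popen.pid, rss_mb)
            except psutil.NoSuchProcess:
                pass  # the flow run finished between sampling and inspection
        threading.Timer(interval, sample).start()

    sample()


agent = LocalAgent()
log_flow_run_memory(agent)
agent.start()  # blocks; the timer keeps sampling in the background
```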
a
Thank you @Michal Baumgartner for checking that. I don’t know what might be causing this. I will ask the team.
@Marvin open “High memory consumption by a LocalAgent with Module storage and DaskExecutor”
@Lukáš Polák and @Michal Baumgartner can you run `prefect diagnostics` and paste the output here? It may be helpful for reproducing the issue.
m
thanks! here's the diagnostics output
```
{
  "config_overrides": {},
  "env_vars": [
    "PREFECT__SERVER__HOST",
    "PREFECT__BACKEND"
  ],
  "system_information": {
    "platform": "Linux-5.10.25-linuxkit-x86_64-with-glibc2.28",
    "prefect_backend": "server",
    "prefect_version": "0.14.22",
    "python_version": "3.9.5"
  }
}
```
a
ty!
z
Hey @Lukáš Polák -- are you using a local dask cluster or a remote one?
m
we're using a remote Dask cluster, deployed on k8s – I've updated the issue with more details on GitHub
z
Thanks! I'll plan to correspond there since it's more discoverable.