
Nathan Low

09/12/2023, 2:24 PM
Hi all, is there any way to clean up Python processes that have crashed in Prefect from the web app? I'm seeing memory use on the agent boxes climb after flows fail or crash, and it seems like the only way to reclaim that memory is to RDP into the agent box and kill the Python processes by hand. Any ideas or help are very welcome.

Bianca Hoch

09/12/2023, 4:51 PM
Hi Nathan, are the flow runs corresponding to the python processes showing up as 'Cancelled', 'Crashed', or 'Failed' in the UI?
Also, are you manually cancelling flow runs from the UI, or have any kind of automation set up that cancels a flow run if it goes over a defined SLA?

Nathan Low

09/12/2023, 5:05 PM
Hi @Bianca Hoch, I see mostly failed ones, mostly after timeouts. I have to manually delete these runs. I don't have any automation set up for failed runs.
@Bianca Hoch anything I can try to do? The only thing I can think of to fix it quickly is to reboot the machine nightly.

Bianca Hoch

09/19/2023, 3:06 PM
Apologies for the delay here! So it sounds like a majority of the python processes are still running even after the timeout is exceeded, and the flow run is marked as failed?
• To enforce the timeout, are you using the timeout_seconds kwarg?
• What sort of infrastructure are you using for the agent?
• What version of prefect are you using? (output of prefect version)

Nathan Low

09/19/2023, 6:19 PM
Thanks Bianca. I haven't tried the timeout_seconds kwarg; I'll give that a shot.
Argh, Enter sends. OK, infra is all local, all Windows. Version is 2.10.10.
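
For the problem discussed above, where Python processes keep running (and holding memory) after a timeout, here is a minimal sketch of a hard timeout that kills and reaps the process instead of only reporting a failure. It is not Prefect-specific and nobody in this thread proposed it. It assumes the work can be launched as a separate child process, and run_with_hard_timeout is a hypothetical helper name. It uses only the Python standard library.

```python
import subprocess
import sys


def run_with_hard_timeout(args, timeout_seconds):
    """Run a child process and kill it if it exceeds the timeout.

    Returns the child's exit code, or None if it had to be killed.
    (Hypothetical helper for illustration; not part of Prefect.)
    """
    proc = subprocess.Popen(args)
    try:
        return proc.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # The timeout alone does not stop the child: it keeps running
        # (and keeps its memory) until it is explicitly killed.
        proc.kill()  # TerminateProcess on Windows, SIGKILL on POSIX
        proc.wait()  # reap the child so the OS can release its resources
        return None


if __name__ == "__main__":
    # Simulate a hung job: a child Python process that sleeps for a minute.
    hung_job = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_hard_timeout(hung_job, timeout_seconds=1))  # prints None
```

Note that proc.kill() only targets the direct child. Any processes that child started itself would need separate handling, which this sketch does not attempt.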