Andreas Nord

09/07/2023, 1:04 PM
Hi! I recently ran into memory issues running Prefect with pandas. A subflow gets stuck in Running, and I received this crash notification for the top-level flow:
State message: Flow run infrastructure exited with non-zero status code -9.
All tasks in the stuck subflow succeed, but the flow run somehow runs out of memory before it reaches a final state. It seems that determining the flow's final state takes a lot of memory when many tasks return pandas DataFrames. Any suggestions?
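
For readers decoding the status code in the message above: a minimal sketch, assuming the reported value follows the Python `subprocess` convention in which a negative return code `-N` means the process was terminated by signal `N`. The helper name `describe_exit_code` is hypothetical and not part of Prefect. Under that assumption, -9 corresponds to SIGKILL, which is what the Linux out-of-memory killer typically sends.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a process exit code (hypothetical helper, not a Prefect API).

    Assumes the subprocess convention: a negative value -N means the
    process was terminated by signal N.
    """
    if code < 0:
        # Look up the symbolic name of the signal, e.g. 9 -> SIGKILL.
        return f"killed by signal {-code} ({signal.Signals(-code).name})"
    if code == 0:
        return "exited normally"
    return f"exited with error status {code}"


print(describe_exit_code(-9))
```

A SIGKILL cannot be caught by the process, which would be consistent with the flow run never getting the chance to report its own final state.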

Nelson Griffiths

09/07/2023, 1:11 PM
@Andreas Nord I would recommend considering a switch from pandas to Polars. We run all our flows on Google Cloud Run and switched to Polars. The flows run much faster and use a lot less memory than our old ones. https://www.pola.rs/
In my experience it has worked nicely with our Prefect setup for data pipelines.

Andreas Nord

09/07/2023, 1:44 PM
I'm aware of this library, but porting all the code is quite a big project. I need a short- to medium-term solution.

Nelson Griffiths

09/07/2023, 1:51 PM