Hi… Is there a way to map over a dataframe without producing some intermediary objects that bloat memory?
Alex Rud
01/19/2021, 3:45 AM
I have an ETL that's something like:
read file to dataframe
->
iterate over dataframe (non-map) to produce a list of structs
->
map over structs to POST to consumer
Alex Rud
01/19/2021, 3:50 AM
This goes from a 150 MB file -> 300 MB dataframe -> 600 MB of structs (900 MB total in RAM)… I'd like to cut out the middleman and map my POSTs over the dataframe directly to keep memory usage down.
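For illustration, here is a rough sketch of what "cutting out the middleman" could look like, assuming the dataframe is a pandas `DataFrame` (the thread doesn't say which library is in use). Rows are pulled lazily with `itertuples()` and sent in small batches, so the full list of structs never exists in memory at once. The names `iter_payloads`, `batched`, `post_all`, and the `post` callable are hypothetical placeholders, not part of any library; `post` stands in for whatever actually sends the HTTP request.

```python
from itertools import islice
from typing import Any, Callable, Dict, Iterable, Iterator, List

Payload = Dict[str, Any]


def iter_payloads(df: Any) -> Iterator[Payload]:
    """Lazily yield one dict per row instead of building a list of structs."""
    # Assumption: df is a pandas DataFrame. itertuples() yields one
    # namedtuple per row, so only the current row is materialized.
    for row in df.itertuples(index=False):
        yield row._asdict()


def batched(items: Iterable[Payload], size: int) -> Iterator[List[Payload]]:
    """Group a lazy stream into lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def post_all(df: Any, post: Callable[[List[Payload]], None], batch_size: int = 500) -> int:
    """Send every row through `post` in batches; return the number of rows sent."""
    sent = 0
    for batch in batched(iter_payloads(df), batch_size):
        post(batch)  # hypothetical: e.g. a function wrapping the consumer's POST endpoint
        sent += len(batch)
    return sent
```

With this shape, peak memory should be roughly the dataframe plus one batch of payloads rather than the dataframe plus every struct. If the consumer only accepts one record per request, `batch_size=1` gives the same streaming behavior. One caveat: `itertuples()` renames columns whose names are not valid Python identifiers, so the dict keys may need remapping in that case.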