Alex Furrier

01/29/2021, 3:57 PM
Hello, I’m running into a memory issue with a mapped task that computes an array and then writes it to a DB. I think what is happening is that all the arrays (which can be quite large) are being held in memory. Since I’m writing these arrays to a DB, I don’t need them to remain in memory after the mapped task has a Success state. I’m running these flows in a Kubernetes container, and after X number of large arrays the memory request is exceeded and the pod is evicted. No other downstream tasks depend on the completion of the mapped tasks. Is there a way to tell Prefect to dump the task result cache after it has been mapped successfully? Pseudocode:
from prefect import task, Flow

@task
def get_array_input(db_client, query):
	return db_client.query(query)

@task
def compute_array_write_to_db(input):
	# Compute a large array, then write it to the DB
	array = requests.get_huge_array()
	db_client.write(array)

with Flow("generate-and-write-arrays") as generate_and_write_arrays:
	lots_of_inputs = get_array_input(db_client, query)
	# Crashes after a certain number of arrays
	# due to the memory limit
	compute_array_write_to_db.map(lots_of_inputs)

Josh Greenhalgh

01/29/2021, 4:11 PM
Can you
del array
after the write perhaps?
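A minimal plain-Python sketch of that suggestion, outside Prefect (`write_to_db` and `BigArray` are hypothetical stand-ins): drop the reference to the large array once the write succeeds, and return None so the only result the task framework holds onto is tiny. The `weakref` here is just instrumentation to show the array really gets reclaimed.

```python
import gc
import weakref

class BigArray(list):
	"""Hypothetical stand-in for a large computed array."""

def write_to_db(array):
	"""Hypothetical DB write; here it just consumes the data."""
	return len(array)

refs = []  # instrumentation only: weak references to check collection

def compute_array_write_to_db(n):
	array = BigArray(range(n))       # the large intermediate result
	refs.append(weakref.ref(array))  # observe when it is reclaimed
	write_to_db(array)
	del array                        # drop the only strong reference after the write
	gc.collect()
	return None                      # return nothing big for the framework to cache

result = compute_array_write_to_db(1_000)
freed = refs[0]() is None            # the array has been garbage-collected
```

Note that `del` alone only removes the local name; the memory is reclaimed because nothing else still references the array. If the mapped task *returned* the array, Prefect would keep a reference to it as the task result, so returning None (or a small summary value) is the part that actually prevents the accumulation.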

Alex Furrier

01/29/2021, 4:13 PM
That seems incredibly obvious not sure why I didn’t think of that immediately haha. Will try to implement and see if that works. Thanks!