# prefect-getting-started
Hi everyone. I see lots of examples where people are using Pandas / SQLAlchemy to do an extract/load operation. What I haven't seen are examples of how to handle datasets larger than memory for these operations. Do you advocate running PySpark or Dask clusters, or is there a mechanism to do something with ACI / ECS Fargate so that a just-in-time, just-big-enough worker can be launched?
Hard to give recommendations without knowing the source format of the data and the destination. When you want to use pandas, loading in chunks might be a good option.
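To illustrate the chunked approach: a minimal sketch, assuming the source is a CSV file and the destination is a SQL database reachable through a DBAPI connection (here SQLite for simplicity). The function name, file path, and table name are illustrative, not anything Prefect-specific. The key idea is that `pd.read_csv(..., chunksize=...)` returns an iterator of DataFrames, so peak memory is bounded by the chunk size rather than the file size.

```python
import pandas as pd

def load_csv_in_chunks(csv_source, conn, table="events", chunksize=100_000):
    """Stream a CSV into a database table one chunk at a time.

    csv_source: path or file-like object readable by pd.read_csv
    conn: a DBAPI/SQLAlchemy connection pandas.to_sql accepts
    """
    total_rows = 0
    # chunksize makes read_csv return an iterator of DataFrames,
    # so we never hold the whole file in memory at once
    for chunk in pd.read_csv(csv_source, chunksize=chunksize):
        chunk.to_sql(table, conn, if_exists="append", index=False)
        total_rows += len(chunk)
    return total_rows
```

Each iteration extracts one slice and appends it to the destination table before the next slice is read, so the same pattern works whether the task runs in a Prefect flow on a small container or locally. For sources other than CSV, the analogous knobs are `chunksize` on `pd.read_sql` or row-group iteration for Parquet.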