# prefect-getting-started
Hi everyone. I see lots of examples where people are using pandas / SQLAlchemy to do an extract/load operation. What I haven’t seen are examples of how to handle data sets larger than memory for these operations. Do you advocate running PySpark or Dask clusters, or is there a mechanism to do something with ACI / ECS Fargate so that a just-in-time, just-big-enough worker can be launched?
Hard to give recommendations without knowing the source format of the data and the destination. If you want to use pandas, loading in chunks might be a good option.
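To illustrate the chunked-loading idea, here is a minimal sketch, assuming both the source and the destination are SQL tables reachable through a DB-API connection (SQLite here, only to keep the example self-contained). The function name `copy_table_in_chunks`, the table names, and the default chunk size are hypothetical and not from the thread. Passing `chunksize` to `pd.read_sql_query` makes it return an iterator of DataFrames rather than one large frame, so only one chunk is processed at a time. Note that with some database drivers the full result set may still be buffered on the client unless streaming is enabled, so this is worth checking for your source.

```python
import sqlite3

import pandas as pd


def copy_table_in_chunks(src_conn, dst_conn, src_table, dst_table, chunksize=50_000):
    """Copy src_table into dst_table one chunk at a time; return the rows copied.

    Hypothetical helper for illustration. Table names are interpolated into
    SQL directly, so they must come from trusted input.
    """
    total = 0
    # With chunksize set, read_sql_query yields DataFrames of at most
    # `chunksize` rows instead of materialising the whole result at once.
    chunks = pd.read_sql_query(
        f"SELECT * FROM {src_table}", src_conn, chunksize=chunksize
    )
    for chunk in chunks:
        # Append each chunk; the destination table is created on first write.
        chunk.to_sql(dst_table, dst_conn, if_exists="append", index=False)
        total += len(chunk)
    return total
```

Usage would look like `copy_table_in_chunks(src, dst, "events", "events_copy", chunksize=10_000)` with two open connections. The chunk size is the knob that trades memory for round trips, so it could be sized to fit whatever worker the flow runs on.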