# prefect-community
c
My general pipeline flow is "fetch data" -> do various operations on data -> persist results. My first instinct was to split those stages up into separate (potentially mapped) tasks, but I'm worried about the overhead of passing large data blobs between tasks (especially if I move to running things on a cluster of some kind in future, where those tasks could end up running on different machines). Am I right to be worried about that? Is there a best practice?
b
Hey Christopher, I'm going to defer to Ryan's post about this, since he provided some very useful information on passing large data objects between tasks with Prefect versions 2.6.0 and up. Please reach out here if you have any additional questions on best practices. 😄
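One common pattern for this (a minimal sketch, not the only approach): have each task write its output to shared storage and return only a lightweight reference, such as a path, so large blobs never travel between tasks directly. The `@task`/`@flow` decorators are standard Prefect 2.x API; the `/tmp/pipeline` location and the fetch/transform logic are placeholder assumptions for illustration.

```python
# Sketch: pass references between tasks instead of large data blobs.
import json
from pathlib import Path

from prefect import flow, task

STORAGE = Path("/tmp/pipeline")  # assumed shared location; use S3/GCS on a cluster


@task
def fetch_data() -> str:
    """Fetch raw data, persist it, and return only its path."""
    STORAGE.mkdir(parents=True, exist_ok=True)
    raw_path = STORAGE / "raw.json"
    raw_path.write_text(json.dumps({"values": [1, 2, 3]}))  # stand-in for a real fetch
    return str(raw_path)


@task
def transform(raw_path: str) -> str:
    """Load the referenced data, transform it, and return the result's path."""
    data = json.loads(Path(raw_path).read_text())
    data["values"] = [v * 2 for v in data["values"]]
    out_path = STORAGE / "transformed.json"
    out_path.write_text(json.dumps(data))
    return str(out_path)


@task
def persist_results(result_path: str) -> None:
    """Final stage: move results to wherever they need to live."""
    print(f"Persisting results from {result_path}")


@flow
def pipeline():
    raw = fetch_data()
    transformed = transform(raw)
    persist_results(transformed)


if __name__ == "__main__":
    pipeline()
```

If you do move to a cluster, swap the local directory for object storage (S3, GCS, etc.) so every worker can resolve the reference. Prefect 2.6+ can also persist task results automatically (e.g. `persist_result=True` and `result_storage` on `@task`/`@flow`), which is what Ryan's post covers in more depth.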