
Prefect Community

Beginner’s Question:

• _I have a number of flows that ingest one or more JSON files and transform them into a single DataFrame_:
• The flows are currently wired to write their output to a target path that is parameterized (via a block, but this is still TBD).
• But I also like to unit-test, and some use cases do not need to land the data at all.
In effect, I have reusable flows.

What is the best pattern for this? Make persistence optional via an input parameter, and optionally return the DF or None? Or control the behavior via blocks?

Maybe give the flow a parameter (let's say it's called `target_path`) that controls the output destination, but make the param optional so the output-writing logic sits behind `if target_path:`

And then you could have multiple deployments of that flow, with `target_path` set or not set, for convenience in the UI.
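A minimal sketch of the optional `target_path` idea, to make it concrete. It is written as a plain function so it runs without any dependencies; in Prefect you would presumably put the `@flow` decorator on it. The name `ingest`, the JSON output format, and the list of dicts standing in for the DataFrame are all assumptions of this sketch, not something from the thread. Always returning the rows is also just one possible answer to the "return the DF or None" question.

```python
import json
from pathlib import Path
from typing import Optional


def ingest(json_paths: list[str], target_path: Optional[str] = None) -> list[dict]:
    """Combine JSON inputs into one table; write it only if target_path is set.

    Hypothetical stand-in for a flow: a list of dicts plays the role of the
    DataFrame so the sketch stays dependency-free.
    """
    rows: list[dict] = []
    for path in json_paths:
        with open(path) as f:
            payload = json.load(f)
        # Accept either a single object or a list of objects per file.
        rows.extend(payload if isinstance(payload, list) else [payload])

    # Output writing sits behind the optional parameter, as suggested above.
    if target_path:
        Path(target_path).write_text(json.dumps(rows))

    # Returning the result either way lets a unit test inspect it
    # without landing any data.
    return rows
```

A unit test can then call `ingest([...])` with no `target_path` and assert on the returned rows, while a production deployment sets `target_path` as a parameter.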

I am trying out having the tasks write to an `AbstractFileSystem`, so I can use S3 for production and a memory-based filesystem for unit testing. This seems to work!
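A rough sketch of that filesystem-injection pattern, for anyone reading later. To keep it runnable with only the standard library, it does not import fsspec: `FileSystem` and `InMemoryFileSystem` are hypothetical stand-ins that mimic only the `open(path, mode)` part of the interface, and `write_rows` is a made-up task name. With fsspec installed, the same `write_rows` should work if you pass it an fsspec filesystem (a memory one in tests, an S3 one in production), since those also expose `open`.

```python
import io
import json
from contextlib import contextmanager
from typing import ContextManager, Iterator, Protocol, TextIO


class FileSystem(Protocol):
    """Hypothetical minimal interface: just the open() call the task needs."""

    def open(self, path: str, mode: str = "r") -> ContextManager[TextIO]: ...


class InMemoryFileSystem:
    """Hypothetical test double that keeps file contents in a dict."""

    def __init__(self) -> None:
        self.files: dict[str, str] = {}

    @contextmanager
    def open(self, path: str, mode: str = "r") -> Iterator[TextIO]:
        if mode == "w":
            buf = io.StringIO()
            try:
                yield buf
            finally:
                # Store whatever was written once the with-block exits.
                self.files[path] = buf.getvalue()
        elif mode == "r":
            yield io.StringIO(self.files[path])
        else:
            raise ValueError(f"unsupported mode: {mode!r}")


def write_rows(fs: FileSystem, path: str, rows: list[dict]) -> None:
    """Task body: write rows as JSON to whichever filesystem was injected."""
    with fs.open(path, "w") as f:
        json.dump(rows, f)


# In a unit test, inject the in-memory filesystem and inspect it afterwards.
fs = InMemoryFileSystem()
write_rows(fs, "out/rows.json", [{"id": 1}])
```

The task only depends on the `open` call, so swapping storage backends is a matter of which filesystem object the flow hands to it.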