# prefect-community
but in task 2, I need to do post-processing on the selfsame data structure as outputted by task 1.. in airflow it would seem this is a no-no without persisting it somewhere first.. I guess just wondering: is there a general preference between mixing purposes and keeping tasks orthogonal?
Hey @Chris Hart! What you’re describing is called “dataflow” and it’s a first-class operation in Prefect. Your preprocessing task 1 can directly pass its output to your downstream task 2.
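For anyone reading along, here is a minimal sketch of that dataflow pattern. It assumes a recent Prefect release (2.x or later) where `flow` and `task` are decorators; the API available when this thread was written may have looked different. The names `preprocess`, `postprocess`, and `pipeline` are hypothetical, and the fallback decorators are only there so the sketch still runs as plain Python when Prefect is not installed.

```python
# Sketch only: assumes Prefect 2.x or later, where `flow` and `task` are decorators.
try:
    from prefect import flow, task
except ImportError:
    # Fallback so the sketch runs without Prefect: pass-through decorators.
    def task(fn):
        return fn

    def flow(fn):
        return fn


@task
def preprocess(raw):
    """Hypothetical task 1: build the data structure."""
    return {"values": [x * 2 for x in raw]}


@task
def postprocess(data):
    """Hypothetical task 2: work on the same structure task 1 returned."""
    return {**data, "total": sum(data["values"])}


@flow
def pipeline(raw):
    data = preprocess(raw)    # task 1's output stays in memory
    return postprocess(data)  # and is handed straight to task 2


result = pipeline([1, 2, 3])
```

Nothing is written to external storage between the two tasks here; the return value of task 1 is simply the argument of task 2.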
There are more advanced cases to consider, such as configuring a
to automatically serialize the passed data in the event of a task failure or retry (so it can be retrieved at a later date without running the preprocessing task again, for example). If that’s interesting, feel free to shoot us a note and we’ll help you get set up.
ok sweet thanks! I'm actually doing that already but had a moment of self doubt about them operating on the same thing because "idempotent all the things" or whatever.. thanks for clarifying that it's encouraged
Awesome — yeah, “idempotent all the things” is definitely good theory in general, but it’s usually really hard in practice. Prefect doesn’t require idempotency.