Hey all, I’ve been looking into Prefect as a potential tool for a streaming data pipeline, and I feel like I might be misunderstanding something about the pricing model, so I’m wondering if someone can help me sort it out.
After reading [this blog post](https://www.prefect.io/blog/you-no-longer-need-two-separate-systems-for-batch-processing-and-streaming/), I was pretty excited about trying Prefect for this use case. When I looked at the pricing page, I saw that the unit being used to measure usage was task runs. So I guess I have two questions about that:
1. Is it fair to say that an implication of paying for task runs is that one would want to minimize the number of tasks in their flow, so as to get the most “bang for their buck”? This seems counter-intuitive to me; I’m on Page 3 of the Prefect Docs, “Thinking Prefectly”, and one thing I’m pretty sure I know by now is that smaller and more discrete tasks are better. Plus, that blog post (rightfully) pointed out that a Flow is better orchestrated by breaking up the logic into multiple tasks as needed.
2. Is it also fair to say that, given that a streaming data pipeline pulls events from a stream at some chosen interval, and presumably each pull constitutes at least one “task run”, the cost of the Prefect job is proportional to the rate at which data is processed? For example, is running a Flow which is efficiently able to pull and process batches of messages every 2 seconds going to be twice as expensive as a Flow which is pulling and processing new batches every 4 seconds? (By the way, when I plug the number of monthly task runs equivalent to continuously pulling batches of messages every 2 seconds into the Pricing calculator, it comes out to… Contact Us 😄)
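
To make question 2 concrete, here is the back-of-the-envelope arithmetic I have in mind, as a rough sketch. It assumes a 30-day month and exactly one task run per pull; both are my own assumptions rather than anything from the pricing page, and `monthly_task_runs` is just a hypothetical helper, not part of Prefect.

```python
# Rough estimate of monthly task runs for a flow that polls continuously.
# Assumptions (mine, not Prefect's): 30-day month, one task run per pull.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def monthly_task_runs(interval_seconds: int, tasks_per_pull: int = 1) -> int:
    """Number of task runs in a month when pulling every `interval_seconds`."""
    pulls = SECONDS_PER_MONTH // interval_seconds
    return pulls * tasks_per_pull


runs_2s = monthly_task_runs(2)
runs_4s = monthly_task_runs(4)
print(f"every 2 s: {runs_2s:,} task runs/month")
print(f"every 4 s: {runs_4s:,} task runs/month")
print(f"ratio: {runs_2s / runs_4s}")
```

If that model is right, halving the polling interval doubles the task-run count, and splitting each pull into more tasks (a larger `tasks_per_pull`) multiplies it again, which is what prompts question 1.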