# prefect-community
k
Hey folks! I’m wondering what would be a good pattern to set up long-running Prefect agents in Prefect 2.0. It looks like `DockerContainer` and `KubernetesJob` create new containers/jobs per flow run, but what I’d ideally like to do is run a fleet of long-running workers that process from the queue. What’s the best infrastructure to choose in Prefect 2.0 to achieve this?
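A minimal sketch of one possible pattern here (not from this thread, and the deployment/queue names are placeholders): in Prefect 2.x you can give a deployment the `Process` infrastructure block, so each flow run executes as a subprocess inside whichever long-running agent picks it up, rather than as a new container or Kubernetes job.

```python
# Hypothetical sketch: a deployment using Process infrastructure so flow runs
# execute inside the long-running agent process instead of new containers/jobs.
from prefect import flow
from prefect.deployments import Deployment
from prefect.infrastructure import Process


@flow
def process_message(payload: dict):
    ...  # the actual short-lived work goes here


deployment = Deployment.build_from_flow(
    flow=process_message,
    name="short-jobs",              # placeholder name
    work_queue_name="short-jobs",   # queue the agents poll
    infrastructure=Process(),       # run in the agent's own environment
)

if __name__ == "__main__":
    deployment.apply()
```

A fleet of agents could then be started with `prefect agent start -q short-jobs` and kept running (e.g. as a long-lived Kubernetes Deployment), so each flow run is just a subprocess in an already-warm worker.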
m
I would say it depends… how often would you actually need to process? Is it every second, every minute, …? Another factor that influences the architecture is how long the actual processing job runs.
k
Good questions! It’s typically a few dozen jobs per minute and the processing time is very short (1-2 seconds)
So for that reason, I’d like to avoid creating a new Kubernetes job per flow run, as the overhead is quite large for a very short job
m
In that case, I think a streaming engine like Apache Beam, Spark Streaming, Flink, … is a better option
Or even writing it in pure Python
k
The jobs involve multiple external API calls and somewhat complex internal state, which is why I wanted to use Prefect / a workflow manager. I don’t think a streaming engine is the right fit here, but pure Python + a queue is a decent idea. Maybe something like SQS could work
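A rough sketch of that "pure Python + queue" idea (my own illustration, not from the thread; the queue URL and flow body are placeholders): a long-running worker that long-polls SQS via boto3 and runs each message through a Prefect flow called as an ordinary Python function, so you still get per-run retries and observability without creating new infrastructure per job.

```python
# Hypothetical worker: long-polls SQS and processes each message with a Prefect flow.
import json

import boto3
from prefect import flow


@flow(retries=2)
def handle_message(body: dict):
    ...  # external API calls / stateful processing goes here


def worker(queue_url: str):
    sqs = boto3.client("sqs")
    while True:
        # Long polling keeps the worker cheap while idle.
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,
        )
        for msg in resp.get("Messages", []):
            handle_message(json.loads(msg["Body"]))
            # Delete only after successful processing.
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=msg["ReceiptHandle"],
            )


if __name__ == "__main__":
    worker("https://sqs.us-east-1.amazonaws.com/123456789012/example-queue")
```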