Ah, the eternal question of resource allocation. It's a bit like asking whether you should put all your eggs in one basket or distribute them among several. Both approaches have their merits, and the optimal choice depends on your specific circumstances.
Running multiple agents on a single virtual machine can be efficient in terms of resource utilization. When agents and workers receive a termination signal, they are expected to complete any in-flight flow runs before the process exits, so stopping an agent or worker that is executing flow runs should be safe as long as you give it enough time to finish its current work. However, on infrastructure such as Kubernetes, where flow runs execute outside the agent or worker process itself, the worst case is that a flow crashes or hangs while the agent or worker is unavailable, leaving the run stuck in a Running state because nothing is around to detect the failure.
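If you want a quick sanity check before sending that termination signal, here is a minimal sketch, assuming a Prefect 2.x-style Python client (the exact module paths for the filter classes vary between versions), that asks the API how many flow runs are still reported as Running:

```python
# Sketch only: assumes Prefect 2.x; filter classes may live under different
# module paths in other versions.
import asyncio

from prefect import get_client
from prefect.client.schemas.filters import (
    FlowRunFilter,
    FlowRunFilterState,
    FlowRunFilterStateType,
)
from prefect.client.schemas.objects import StateType


async def count_running_flow_runs() -> int:
    """Return how many flow runs the API currently reports as Running."""
    async with get_client() as client:
        runs = await client.read_flow_runs(
            flow_run_filter=FlowRunFilter(
                state=FlowRunFilterState(
                    type=FlowRunFilterStateType(any_=[StateType.RUNNING])
                )
            )
        )
        return len(runs)


if __name__ == "__main__":
    active = asyncio.run(count_running_flow_runs())
    print(f"{active} flow run(s) still reported as Running")
```

If that count is zero, stopping the agent or worker shouldn't interrupt anything; otherwise, wait or let the graceful-shutdown behavior finish the work for you.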
On the other hand, running a separate virtual machine for each agent provides better isolation and fault tolerance: if one agent fails, the others are unaffected, which matters when you're running critical workflows. If you do migrate agents to new machines and new work queues, keep in mind that existing deployments must be reassigned to the newly created work queue, or their scheduled flow runs will never be picked up. Alternatively, it's often simpler to start a new agent or worker against the existing work queue and then decommission the old one.
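As a hedged sketch of the reassignment option (assuming Prefect 2.x-style deployments; `my_flow`, "my-deployment", and "new-queue" are placeholder names, not anything from your setup), you could re-apply an existing deployment with a different `work_queue_name` so its scheduled runs land on the new queue:

```python
# Sketch only: assumes the Prefect 2.x Deployment API; names are placeholders.
from prefect import flow
from prefect.deployments import Deployment


@flow
def my_flow():
    """Placeholder flow standing in for your real workflow."""
    print("hello from my_flow")


if __name__ == "__main__":
    deployment = Deployment.build_from_flow(
        flow=my_flow,
        name="my-deployment",         # existing deployment name (placeholder)
        work_queue_name="new-queue",  # the newly created work queue
    )
    deployment.apply()  # re-registers the deployment against the new queue
```

Re-applying like this moves the scheduled runs over in one step, whereas the second option, starting a fresh agent or worker on the existing queue, avoids touching the deployments at all.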
In the end, the best practice depends on your specific needs and constraints. Consider factors such as the criticality of your workflows, your tolerance for failure, and your resource availability when making your decision. And remember, no matter what you choose, there's always room for improvement. After all, we're only human. Or, in my case, a robot.