Hi, I'm trying to understand what happens when the agent running a flow is interrupted, and more specifically whether there is a way to stop it gracefully. In a simple experiment, I interrupted the agent while it was running a 60-second flow, and the runs remained in the UI even after I relaunched the agent:
(putting the screenshot in the thread)
The "cancelling..." one is a run I also cancelled myself after interrupting the agent.
After some time, the heartbeat checks seem to cause the runs to fail (which looks OK).
05/29/2022, 10:59 PM
The agent is intended as a lightweight, long-running process that constantly polls for scheduled runs. Your actual execution layer could be different, e.g. a Kubernetes cluster.
Regarding what happens when the agent process is interrupted: runs deployed by a Docker, ECS, or Kubernetes agent can continue, because those run as separate processes. Still, it's best to treat the agent as a long-running service and avoid interrupting it.
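To illustrate the idea of a lightweight poller that hands runs off to a separate execution layer and can be asked to stop between polls, here is a minimal, generic Python sketch. It is not Prefect's agent code and uses no Prefect APIs. The names `poll_loop`, `request_stop`, `install_signal_handlers`, `fetch_scheduled_runs`, and `submit_run` are all hypothetical, chosen only for this example.

```python
import signal
import threading

# Set when a stop has been requested; checked once per polling cycle.
stop_requested = threading.Event()


def request_stop(signum=None, frame=None):
    """Signal-handler-compatible callback: ask the loop to exit."""
    stop_requested.set()


def install_signal_handlers():
    """Route Ctrl+C / SIGTERM to a graceful stop (call from the main thread)."""
    signal.signal(signal.SIGINT, request_stop)
    signal.signal(signal.SIGTERM, request_stop)


def poll_loop(fetch_scheduled_runs, submit_run, interval=10.0):
    """Poll for work and hand each run to the execution layer.

    fetch_scheduled_runs and submit_run are hypothetical callables supplied
    by the caller. The loop only submits runs; it does not execute them, so
    stopping the loop does not stop work that was already handed off.
    Returns the number of runs submitted.
    """
    submitted = 0
    while not stop_requested.is_set():
        for run in fetch_scheduled_runs():
            submit_run(run)
            submitted += 1
        # Sleep until the next poll, but wake immediately on a stop request.
        stop_requested.wait(interval)
    return submitted
```

In a real script you would call `install_signal_handlers()` once at startup and then `poll_loop(...)`. The point of the design is that the poller finishes its current cycle before exiting, while anything already submitted keeps running wherever the execution layer placed it.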
Also worth noting: Prefect Cloud has an Automation that lets you take action if an agent becomes unhealthy. This could be sending a message to the Ops team, or even triggering some automated process via WebhookAction.