Hi Prefect team, I currently can't do anything because Prefect keeps hanging, and I need help. First, I need to stop all running tasks in the scheduler (I attach a screenshot below). Second, I need to find the cause, fix it, and prevent this situation from recurring.
Prefect runs on local servers, in Docker, not in a cloud service. Tasks are performed in separate projects. Four agents have been launched that interact with GitLab. The first problem is that an agent freezes while performing a task, and monitoring shows it running out of RAM. In the task manager, docker "top" shows many python *.py processes where parent ID = -1, kernel, docker container.
I checked: each Flow has the label "PROD", and so do the five running agents. According to DevOps, Prefect's scheduler tries to launch the next run of the same Flow after some time. If all agents are running, all of these queued runs are launched simultaneously on every available agent with that label. As a result, the agents stop working when resources run out, because the release on the DWH is blocked by the subsequent launch of the flow.
04/04/2022, 5:42 PM
Just confirming, you are saying running concurrent flows is causing the issue? I think concurrent flows would still be picked up by the agent though, which they aren’t in the picture. Or are the flows causing the agent to die?
04/05/2022, 6:45 PM
Flows die when they run on the same agent at the same time. When an agent is short on resources, the work is not switched to another agent.
One idea: name the three agents differently and indicate in the flow that it may run on any of them. I think that would make it possible to launch two flows on different agents.
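A rough sketch of how that idea would interact with label matching, assuming "naming agents differently" means giving each agent its own label in addition to "PROD". As far as I understand Prefect 1.x, an agent only picks up a flow run when every label on the run is also present on the agent, so listing several agent labels on one flow would not mean "any of them". The agent names and the helper functions below are hypothetical and only model the rule; they are not Prefect APIs.

```python
# Model of the label rule as I understand it for Prefect 1.x agents:
# a flow run is eligible for an agent only if the run's labels are a
# subset of the agent's labels. Pure Python, no Prefect import needed.

def agent_can_pick_up(flow_labels, agent_labels):
    """Return True if every flow label is also present on the agent."""
    return set(flow_labels) <= set(agent_labels)

# Hypothetical agents: each keeps "PROD" and gets one unique label.
AGENTS = {
    "agent-1": {"PROD", "agent-1"},
    "agent-2": {"PROD", "agent-2"},
    "agent-3": {"PROD", "agent-3"},
}

def eligible_agents(flow_labels):
    """Return the sorted names of agents that could pick up the flow run."""
    return sorted(
        name for name, labels in AGENTS.items()
        if agent_can_pick_up(flow_labels, labels)
    )

# Only "PROD": every agent competes for the run (the current situation).
print(eligible_agents(["PROD"]))
# "PROD" plus one agent label: the run is pinned to that agent.
print(eligible_agents(["PROD", "agent-2"]))
# Two agent labels: no agent carries both, so nothing picks it up.
print(eligible_agents(["agent-1", "agent-2"]))
```

Under this reading, pinning each flow to a single agent label spreads flows across agents, but a flow cannot be declared as runnable on "any of" several differently labelled agents.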
04/05/2022, 8:27 PM
Am re-reading this. Just so you know, there is a cancel button for all the currently scheduled flows in the main dashboard. There is something in Server to prevent agents from picking up the same flow, but this mechanism is more robust in Prefect Cloud. Are you using Cloud or Server? The Lazarus process should also reschedule them if they were not submitted.
Yes, though Prefect is not designed to be aware of the resources available on an agent. If two agents can pick up a flow, there is no innate load balancing.
04/06/2022, 8:14 AM
I use the server. I did not find a button to cancel all scheduled tasks. Can you be a little more specific about where this button is located?
04/06/2022, 2:07 PM
This was added in 0.15.something; you can click it.
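If the button is not available in the installed version, scheduled runs could probably also be cancelled through the API. Below is a minimal sketch, assuming a Prefect 1.x Server where the `flow_run` table is queryable through GraphQL and the client exposes `graphql` and `cancel_flow_run`. The helper name `cancel_scheduled_runs` is made up for illustration, and the query may need a limit or extra filters (for example by flow) on a real installation.

```python
# Sketch: cancel every flow run that is currently in the Scheduled state.
# `client` is expected to behave like prefect.Client from Prefect 1.x:
#   client.graphql(query)          -> mapping with a "data" key
#   client.cancel_flow_run(run_id) -> cancels one flow run

SCHEDULED_RUNS_QUERY = """
query {
  flow_run(where: {state: {_eq: "Scheduled"}}) {
    id
    name
  }
}
"""

def cancel_scheduled_runs(client):
    """Cancel all Scheduled flow runs and return the list of their ids."""
    result = client.graphql(SCHEDULED_RUNS_QUERY)
    cancelled_ids = []
    for run in result["data"]["flow_run"]:
        client.cancel_flow_run(run["id"])
        cancelled_ids.append(run["id"])
    return cancelled_ids

# Against a real server (assumption, not executed here):
# from prefect import Client
# print(cancel_scheduled_runs(Client()))
```

Passing the client in as an argument keeps the helper easy to try against a stub before pointing it at the production server.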