Robert Banick

08/01/2023, 8:51 PM
Hi all, we’re diagnosing problems with an agent hosted on AWS ECS that’s regularly becoming unresponsive, causing our queues to go unhealthy. Has anyone else struggled with an agent that “falls asleep” regularly? Is it possible that bombarding an agent with lots of logs to route to the Prefect UI can overwhelm it and cause it to shut down?

Will Raphaelson

08/02/2023, 8:54 PM
Hey Robert - we can also talk about this when we link up next week - but have you tried upgrading to workers? If there is a critical issue in agents, obviously we’ll fix it, but future investment will be on the worker side. If you have any logs or leads on the “falling asleep” agent, DM me and we can dig in.

Robert Banick

08/02/2023, 8:56 PM
We haven’t had time yet to investigate the logs issue but will next time it happens. Upgrading to workers is indeed my go-to response here. I’m a little leery of changing something so critical that otherwise works just fine. To be honest our initial setup journey with Prefect involved a lot of semi-undocumented gotchas and I’m therefore not too keen on change.

Will Raphaelson

08/02/2023, 8:57 PM
I hear you. Keep me posted on how I can help (in addition to continuously improving documentation).