Hi there. I have a flow running on ECS (Prefect 1.0). Some tasks got stuck in a Running state and stayed there for a long time. How can we avoid this? Can Prefect Cloud do something about them? Is it possible to set a timeout at the task level? I stopped them manually, but it would be good to have an automated way of killing tasks that run for more than X hours. Thanks!
Mason Menges
09/15/2022, 9:20 PM
Hey Pedro, we recently made some updates to our Lazarus service on the backend that should address Prefect tasks getting stuck in a Running state. This may help with what you're seeing on the ECS side, but definitely let us know if you continue to see strange behavior. Otherwise, you might check out this AWS document for image cleanup suggestions for the ECS tasks: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/automated_image_cleanup.html
Pedro Machado
09/15/2022, 9:53 PM
Thanks, Mason. I'll let you know if it happens again.
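Note on the task-level timeout question from the top of the thread: as far as I know, the Prefect 1.0 `task` decorator accepts a `timeout` argument in seconds (for example `@task(timeout=3600)`), which is worth checking in the Prefect 1.0 task docs. Independent of Prefect, the "kill it after X hours" idea can be sketched with the standard library alone. The helper name `run_with_timeout` and the status strings below are made up for illustration, and this is only a rough sketch of a watchdog around a child process, not how Prefect or Lazarus work internally.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run `cmd` as a child process and kill it if it exceeds the limit.

    Hypothetical helper: returns "finished", "failed", or "timed_out".
    """
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # once timeout_seconds has elapsed.
        completed = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return "timed_out"
    return "finished" if completed.returncode == 0 else "failed"


if __name__ == "__main__":
    # A job that sleeps far longer than the 1-second limit we allow.
    slow_job = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(slow_job, timeout_seconds=1))
```

For a real flow, the limit would be something like `X * 3600` seconds, and the wrapped command would be whatever the task actually runs.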