haf
12/10/2021, 11:35 AMe2631f0e-2987-405f-bf3b-3fbb46acf90c
haf
12/10/2021, 11:37 AMAnna Geller
concurrent.futures._base.CancelledError
can result from a long-running step in computation where there is no output (logging or otherwise) to the Client
. In these cases, due to the lack of interaction with the client, the scheduler regards itself as “idle” and times out after the configured cloudprovider.ecs.scheduler_timeout
period, which defaults to 5 minutes. The CancelledError error message is misleading, but if you look in the logs for the scheduler task itself it will record the idle timeout.
The solution is to set scheduler_timeout
to a higher value, either via config or by passing directly to your cluster class constructor.
The answer is stolen from here.haf
12/10/2021, 11:46 AMhaf
12/10/2021, 11:46 AMAnna Geller
haf
12/10/2021, 4:20 PMhaf
12/10/2021, 4:22 PMbut if you look in the logs for the scheduler task itself it will record the idle timeout.There's no timeout in the scheduler logs for me.
haf
12/10/2021, 5:30 PMhaf
12/10/2021, 5:31 PMAli Abdelmotalib
01/14/2022, 1:40 PMhaf
01/17/2022, 7:50 AM