My friends - a question. Say I have the followin...
# ask-community
t
My friends - a question. Say I have the following:
Copy code
# BANE sometimes gets cauht in some stalled staTE
@task(timeout_seconds=60 * 45, retries=3)
def task_run_bane_and_aegean(
    image: Union[WSCleanCommand, LinmosCommand], aegean_container: Path
) -> AegeanOutputs:
What is the expected behaviour in the situation where this function takes longer than 45 minutes? Should this be killed by the execution engine and restarted? In practise when using the dask backend and a SlurmCluster I am finding that this is ineffective, with the task running for upwards of 8 hours before I have to step in to fix. My real question is whether
timeout_seconds
should stop the task after 45minutes, or whether the timeout being referred to is something else? Like a heartbeat health check or something