Thread
#prefect-community
    Samuel Hinton

    9 months ago
    Hi all! Is anyone aware of a way to time out a flow itself and not just the tasks? I.e., Task has a timeout parameter which we can pass in, but I’m currently experiencing some odd issues where tasks seem to get lost somewhere in Dask (they are submitted to Dask but never come back and never time out), and this means my flows never end. Ideally I’ll dig into our env, Dask, and Prefect to figure out what is causing the silent, untracked failure, but as an interim solution, does anyone know of a way I can say “Cancel the flow and all its tasks if it’s been an hour since you started”?
    For some context, this is what a flow in that situation typically looks like: started many hours ago, still in the process of being cancelled manually, and even the past tasks that come from a parametrised run get sent for execution but are never picked up.
    Anna Geller

    9 months ago
    @Samuel Hinton if you are on Prefect Cloud, you can use Automations to set an SLA on a flow. Here is what it looks like:
    Samuel Hinton

    9 months ago
    Alas, Server. But considering Cloud once Orion is out and stable.
    Anna Geller

    9 months ago
    What is also relevant:
    • When cancelling a flow run, any actively running tasks can’t be hard-stopped when using a shared Dask cluster. Instead, the flow runner will stop submitting tasks but will let all active tasks run to completion.
    • With a temporary cluster, the cluster can be shut down to force-stop any active tasks, speeding up cancellation.
    So if you’re currently using an always-on Dask cluster, you can experiment with a temporary Dask cluster instead to help mitigate such issues.
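The difference between a shared and a temporary Dask cluster comes down to which arguments the executor receives. The following is a minimal sketch, assuming Prefect 1.x, where `DaskExecutor` is importable from `prefect.executors` and accepts either `address` or `cluster_class` with `cluster_kwargs`. The helper names `dask_executor_kwargs` and `make_executor`, the scheduler address, and the worker count are hypothetical and only serve the illustration.

```python
from typing import Any, Dict, Optional


def dask_executor_kwargs(
    shared_address: Optional[str] = None, n_workers: int = 4
) -> Dict[str, Any]:
    """Build keyword arguments for a Prefect 1.x DaskExecutor (assumed API)."""
    if shared_address is not None:
        # Shared, always-on cluster: the executor only connects to it,
        # so active tasks keep running when a flow run is cancelled.
        return {"address": shared_address}
    # Temporary cluster: created for the flow run and shut down afterwards,
    # which lets cancellation force-stop active tasks.
    return {
        "cluster_class": "dask.distributed.LocalCluster",
        "cluster_kwargs": {"n_workers": n_workers},
    }


def make_executor(**kwargs: Any):
    """Create the executor; Prefect is imported lazily (hypothetical helper)."""
    # Assumption: Prefect 1.x import path.
    from prefect.executors import DaskExecutor

    return DaskExecutor(**kwargs)
```

With Prefect 1.x installed, something like `flow.executor = make_executor(**dask_executor_kwargs())` would attach a temporary cluster to an existing `flow`, while passing a scheduler address keeps the shared-cluster behaviour.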
    Samuel Hinton

    9 months ago
    Will look into it, cheers
    Also, with those SLAs, are they exportable to code? We’re moving to infrastructure as code, and the idea of having a third party which requires significant manual input to get up and running would be a pain point on any sign-off. So fingers crossed, is infra-as-code available now or coming soon?
    Anna Geller

    9 months ago
    Well, technically speaking it’s not infrastructure. But you can create Automation actions via the GraphQL API instead of using the UI. Example:
    mutation {
      create_action(input: {config: {create_flow_run: {flow_group_id: "b097b505-f4b3-401d-9e69-87eccbcc0794"}}}) {
        id
      }
    }
    For you it would probably be something like this to cancel a flow run based on an SLA:
    mutation {
      create_action(
        input: {config: {cancel_flow_run: {message: "Flow run cancelled because it ran longer than the duration specified in the SLA"}}}
      ) {
        id
      }
    }
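To keep this in code rather than in the UI, the mutation above could be sent from a script. Below is a minimal sketch using only the Python standard library. The mutation text is copied from the thread; the API URL, the token, the `Bearer` authorization header, and the function names `build_graphql_request` and `send` are assumptions for illustration, not confirmed details of the Prefect API. It only creates the action, exactly as the mutation in the thread does.

```python
import json
import urllib.request

# Mutation text as given in the thread.
CANCEL_ACTION_MUTATION = """
mutation {
  create_action(
    input: {config: {cancel_flow_run: {message: "Flow run cancelled because it ran longer than the duration specified in the SLA"}}}
  ) {
    id
  }
}
"""


def build_graphql_request(
    api_url: str, api_token: str, query: str
) -> urllib.request.Request:
    """Wrap a GraphQL document in a JSON POST request (nothing is sent yet)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the API accepts a bearer token.
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call such as `send(build_graphql_request(api_url, api_token, CANCEL_ACTION_MUTATION))` would then return the decoded response, with `api_url` and `api_token` supplied from your own configuration.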
    Samuel Hinton

    9 months ago
    Great, I’ll include that in my write-up, thanks for all the help @Anna Geller 🙂