Thread
#prefect-community
    Robin

    2 years ago
    Dear community, today I started playing around with `max_retries` and `retry_delay` and I got the following error:
    /opt/prefect/healthcheck.py:149: UserWarning: Task <Task: copy_storage> has retry settings but some upstream dependencies do not have result types. See <https://docs.prefect.io/core/concepts/results.html> for more details.
      result_check(flows)
    It's a mapped task. I read the documentation about result types, but it did not answer several questions:
    • What might go wrong if I don't set a result type?
    • If I understood correctly, the result type can be set at either the flow or the task level. I think I want it at the flow level, since each task should only run once (if successful), so the result does not change during the flow. Is that correct?
    • Should I therefore do something like `with Flow("creative_name", result=flow_result)`? And what should I set `flow_result` to?
    Bests, Robin
    nicholas

    2 years ago
    Hi @Robin - not persisting results can lead to scenarios where even though you’ve specified retries, a retrying task has no way to access its upstream data dependencies, since they may not be in memory. Results and caching can enable the workflow you described, where each task is run just once, unless it fails. As for which result type you choose, at a high level that’s mostly determined by the granularity of the results configuration you want to achieve.
    Robin

    2 years ago
    OK, I think I understand, but I'm not sure... Could you give one concrete example (or a link to one) that should work?
    nicholas

    2 years ago
    The infrastructure is a more important question here than the actual code. Imagine a distributed Dask cluster running a mapped pipeline in parallel: if one of the mapped tasks fails and the node where it was running has been cleaned up or reprovisioned (as might be the case with a retry delay), the node that picks up the retry might be different from the one that failed. In that case, the new node needs to be able to access the persisted results of the upstream tasks so that it can rerun with the same inputs.
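The failure mode described here can be illustrated in plain Python (this is not Prefect code, just a toy sketch of why a persisted result lets a retry recover upstream data after in-memory state is gone; all names are invented):

```python
import os
import pickle
import tempfile

class DiskResult:
    """Toy stand-in for a persisted result: pickles values to a directory."""

    def __init__(self, directory):
        self.directory = directory

    def write(self, name, value):
        with open(os.path.join(self.directory, name), "wb") as f:
            pickle.dump(value, f)

    def read(self, name):
        with open(os.path.join(self.directory, name), "rb") as f:
            return pickle.load(f)

result_store = DiskResult(tempfile.mkdtemp())

# The "upstream" task runs once and persists its output.
upstream_output = [1, 2, 3]
result_store.write("upstream", upstream_output)

# Simulate the original worker disappearing: the in-memory copy is gone.
del upstream_output

# The retrying "downstream" task, possibly on a different node, rehydrates
# its input from the persisted result instead of relying on memory.
inputs = result_store.read("upstream")
retried = [x * 2 for x in inputs]
print(retried)  # [2, 4, 6]
```

Without the `write`/`read` round trip, the `del` step would leave the retry with no way to reconstruct its inputs, which is exactly what the healthcheck warning is guarding against.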
    Robin

    1 year ago
    OK got it, it works!