I’m running a flow where both the flow and the tas...
# ask-community
a
I’m running a flow where both the flow and the tasks have retries. On the first flow run, the task with retries behaved as expected, waiting retry_delay_seconds between each attempt. Once the task's retries were exhausted, the flow retry kicked in and waited retry_delay_seconds before trying the flow again. So far so good. The issue is that on the second flow attempt the task did retry three times, but it didn’t respect retry_delay_seconds and just ran the three tries one after the other, really fast. Is this expected behaviour, or did I land on a bug?
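To make the expected behaviour concrete, here is a minimal sketch in plain Python (no Prefect calls) of when each task attempt should start. It assumes that every flow attempt re-runs the task with a fresh retry budget and that each attempt fails instantly. The function name `expected_attempt_offsets` and the values passed to it are hypothetical, chosen only for illustration.

```python
def expected_attempt_offsets(task_retries, task_delay, flow_retries, flow_delay):
    """Return the ideal start offset (in seconds) of every task attempt.

    Assumption: attempts fail instantly, so only the retry delays add time.
    """
    offsets = []
    clock = 0
    for flow_attempt in range(flow_retries + 1):
        if flow_attempt > 0:
            clock += flow_delay  # wait before the flow itself is retried
        for task_attempt in range(task_retries + 1):
            if task_attempt > 0:
                clock += task_delay  # wait before the task is retried
            offsets.append(clock)
    return offsets


# Hypothetical settings: 2 task retries and 1 flow retry, 10 s delay each
offsets = expected_attempt_offsets(
    task_retries=2, task_delay=10, flow_retries=1, flow_delay=10
)
print(offsets)
```

Under these assumptions every consecutive attempt is at least one delay apart, including the attempts in the second flow run. Back-to-back attempts with no gap would not match this schedule.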
m
Hmm, my first reaction is that this sounds like a bug. What version of Prefect are you running?
a
This is what should be running in the image:
```
prefect==2.16.5
```
m
Are you able to reproduce the behavior locally? Or is it just when running on the image? If you have an example of the code you're running, that would be super helpful as well.
a
I had tried locally, but without retry_delay_seconds, just to make sure it would total the right number of retries, and that was fine. I’ll see if I can reproduce something equivalent locally, and if I can, I’ll share it with you.
I wrote this code and it works correctly. I’m not sure what the difference would be when it ran in our flow.
```python
from datetime import datetime

from prefect import flow, task

# Start time of every task attempt, across all flow attempts
times = []


@task(retries=2, retry_delay_seconds=10)
def raise_a_task_exception():
    # Record when this attempt started, then fail to trigger a retry
    times.append(datetime.now())
    print(times)
    raise ValueError("This is a task exception")


@flow(retries=1, retry_delay_seconds=10)
def raise_an_exception():
    raise_a_task_exception()


if __name__ == "__main__":
    try:
        raise_an_exception()
    except Exception:
        # The flow fails on purpose; show the recorded attempt times
        print(times)
```
[datetime.datetime(2024, 4, 4, 12, 21, 55, 841183), datetime.datetime(2024, 4, 4, 12, 22, 6, 387484), datetime.datetime(2024, 4, 4, 12, 22, 17, 68533), datetime.datetime(2024, 4, 4, 12, 22, 28, 510515), datetime.datetime(2024, 4, 4, 12, 22, 39, 357650), datetime.datetime(2024, 4, 4, 12, 22, 49, 976638)]
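The timestamps are easier to judge as gaps between consecutive attempts. The sketch below copies the six values from the output above and prints the difference between each pair. The variable names are just for illustration.

```python
from datetime import datetime

# Attempt start times copied from the output above
times = [
    datetime(2024, 4, 4, 12, 21, 55, 841183),
    datetime(2024, 4, 4, 12, 22, 6, 387484),
    datetime(2024, 4, 4, 12, 22, 17, 68533),
    datetime(2024, 4, 4, 12, 22, 28, 510515),
    datetime(2024, 4, 4, 12, 22, 39, 357650),
    datetime(2024, 4, 4, 12, 22, 49, 976638),
]

# Seconds elapsed between each attempt and the next one
gaps = [
    (later - earlier).total_seconds()
    for earlier, later in zip(times, times[1:])
]

for attempt, gap in enumerate(gaps, start=1):
    print(f"attempt {attempt} -> {attempt + 1}: {gap:.1f}s")
```

All five gaps are a little over 10 seconds, including the one between the third and fourth attempts, where the flow retry happens. The six attempts match 3 task attempts (retries=2) times 2 flow attempts (retries=1). So in this local run the delays were respected on the second flow attempt as well.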
I have this error in my logs, so maybe something else went wrong: "Task run ‘xyz’ received abort during orchestration: The enclosing flow must be running to begin task execution. Task run is in SCHEDULED state."
It showed as a warning.
m
That error is basically saying the task tried to run outside of a flow run context, which seems kinda odd if the flow was retried. What worker is the flow running from when you see this happen, i.e. Docker, ECS, Kubernetes?
a
It’s in Kubernetes. We’re still on the old agents system, though we’re working on moving to workers eventually.